Monthly Archives: November 2018

Understanding the PointNet paper

Last week I read the PointNet series of papers, which study how to process point cloud data. The broader research area is machine learning for 3D vision, with applications in autonomous driving and robotics, since some of the images returned by depth cameras, sensors, and lidar are point clouds.

Point cloud data is a format that represents a 3D image with points, for example the 3D image of a chair. Sample 2400 points from the chair's surfaces, each point a three-dimensional (x, y, z) vector; plotting these points in 3D space then recovers the image of the chair. There is an example that can be viewed online. That image has many points, so it looks quite realistic. With fewer points, it would look like this

Unlike 2D images, point cloud data is unordered. Traditional CNNs extract dependencies between local pixels through convolutions with sliding kernels (e.g. max pooling, and convolution as a weighted sum of local points). But point clouds have no clear local structure. How to solve this? 1) Restore the points to 3D space, partition it into small grids (voxels), and apply 3D convolutions. 2) Project the point cloud onto 2D; representative algorithms are MV3D and AVOD.

PointNet is the key paper in this series, published in 2016. It argues that no such preprocessing is needed; instead, point cloud data can be fed directly into a multi-layer perceptron. The steps are as follows:

  1. Assume batch size = 32 and each point cloud has 1024 points. Each batch of input data then has shape 32 * 1024 * 3.
  2. First, multiply every cloud in this batch by a 3 * 3 transform matrix. In terms of shapes, the 32 * 1024 * 3 tensor is multiplied by a 32 * 3 * 3 tensor using the tf.matmul function, and the result is again 32 * 1024 * 3. The physical meaning is a coordinate transformation of every point. The motivation comes from the paper on spatial transformers, which I haven't read yet.
    • This transform matrix is not randomly generated but produced by a fairly complex T-Net. Its architecture mirrors the larger network: the input cloud is lifted to higher dimensions several times with nonlinear transformations plus a bias, then brought back down, yielding the matrix. So although different clouds share the network parameters, they are not multiplied by the same matrix.
    • Why such a complicated network? It is similar in spirit to GRU-style architectures: the current data is used both to generate weights and to be processed by those very weights.
  3. After this transform the shape is still 32 * 1024 * 3. Next comes the first dimension-raising step. Concretely, take a weighted sum of each point's three coordinates to get a scalar, and repeat this 64 times to get a 64-dimensional vector; this is just multiplying a length-3 vector by a 3 * 64 matrix. The result is a 32 * 1024 * 64 tensor. Add a randomly initialized bias, then apply batch normalization; the shape is unchanged.
    • In the author's source code, this step is implemented not by stacking 64 linear layers but with TensorFlow's conv2d, using the channel dimension to stand for 64 fully connected layers.
  4. Next is another such "dimension-raising" step, identical to the previous one, except the output dimension stays 64.
  5. The data shape is now 32 * 1024 * 64. Next comes the feature transform: the same network structure produces a 32 * 64 * 64 tensor, and tf.matmul transforms every point once more, on the same principle as step 2.
  6. After this transform, the features are lifted again: first to 64 dimensions, then 128, and finally 1024, ending with shape 32 * 1024 * 1024.
    • In the author's source code the shape is 32 * 1024 * 1 * 1024, because the conv2d op is used.
  7. Now the key step: summarize all the points in each cloud into one vector carrying the global information. This is done with max pooling, i.e. for each channel, take the maximum over all points in the cloud (element-wise max pooling).
    • The source code implements this with the tf.nn.max_pool function: the input is the 32 * 1024 * 1 * 1024 tensor, kernel size = [1, num_points (1024), 1, 1], and the output is 32 * 1 * 1 * 1024.
    • Why not sum or average? The paper establishes theoretical properties of the max pooling operation, which I haven't read yet.
  8. After all this work we finally have a global vector for each cloud. The last step is to reshape it into a 32 * 1024 block, dropping the two redundant dimensions. (A shape-level sketch of steps 2 through 8 follows below.)
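To make the shapes concrete, here is a minimal NumPy sketch of steps 2 through 8 (not the authors' code: the transform matrices and layer weights below are random stand-ins for what the T-Nets and the shared MLP would actually learn):

    import numpy as np

    B, N = 32, 1024                      # batch size, points per cloud
    x = np.random.randn(B, N, 3)         # a batch of point clouds

    # Step 2: input transform -- multiply each cloud by its own 3 * 3 matrix.
    # In PointNet this matrix comes from the T-Net; here it is random.
    T_input = np.random.randn(B, 3, 3)
    x = x @ T_input                      # (B, N, 3) @ (B, 3, 3) -> (B, N, 3)

    # Steps 3-4: shared per-point MLP, 3 -> 64 -> 64 (same weights for every point).
    for d_in, d_out in [(3, 64), (64, 64)]:
        W, b = np.random.randn(d_in, d_out), np.random.randn(d_out)
        x = np.maximum(x @ W + b, 0)     # (B, N, d_out); ReLU stands in for BN + nonlinearity

    # Step 5: feature transform with a 64 * 64 matrix per cloud (again a T-Net output).
    T_feat = np.random.randn(B, 64, 64)
    x = x @ T_feat                       # (B, N, 64)

    # Step 6: shared per-point MLP, 64 -> 64 -> 128 -> 1024.
    for d_in, d_out in [(64, 64), (64, 128), (128, 1024)]:
        W, b = np.random.randn(d_in, d_out), np.random.randn(d_out)
        x = np.maximum(x @ W + b, 0)     # (B, N, d_out)

    # Steps 7-8: max pool over the point axis -> one global vector per cloud.
    global_feat = x.max(axis=1)          # (B, 1024)
    print(global_feat.shape)             # (32, 1024)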

The steps above learn a representation. The remaining steps depend on the specific task:

  • classification: only the global representation is needed. Pass it through three fully connected layers to produce a length-40 vector (there are 40 classes in total), turn it into probabilities with a softmax layer, use the mean cross entropy as the loss function, and optimize the parameters against that loss (a sketch of this head follows the list).
  • segmentation: the author's approach is to classify every point: each point is concatenated with the global feature vector and then classified.
  • semantic segmentation: I don't really understand this part, so I'll skip it.
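And a minimal sketch of the classification head described above (only the length-40 output and the mean cross entropy come from the summary; the hidden sizes 512 and 256 and the random weights are my assumptions):

    import numpy as np

    B = 32
    global_feat = np.random.randn(B, 1024)           # global feature from max pooling

    def fc(x, d_out, relu=True):
        W, b = np.random.randn(x.shape[1], d_out), np.random.randn(d_out)
        h = x @ W + b
        return np.maximum(h, 0) if relu else h

    # three fully connected layers down to 40 classes
    logits = fc(fc(fc(global_feat, 512), 256), 40, relu=False)

    # softmax to get class probabilities
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)

    # mean cross entropy against (dummy) integer labels
    labels = np.random.randint(0, 40, size=B)
    loss = -np.log(probs[np.arange(B), labels]).mean()
    print(loss)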

The main points of this paper are 1) applying a simple network structure directly to point cloud data, and 2) the properties of the max pooling operation (invariance to input order).

Writing Python tests: doctest, unittest, and pytest

So I set myself another task the other day: writing tests for an open-source project, a baby step towards software engineering! This post describes the tools and practices I've learnt.

The project I've committed to is here. The goal is to write tests for an experiment class, whose objective is to save the state of machine learning experiments and quickly retrieve it.

There are two sets of tools to be learnt: 1) conceptually, what is a test and how to write one? 2) practically, what tools does Python offer for writing and organizing test code, and how to use them? I will answer these two sets of questions below.

What is software testing and how to write it?

Testing is just writing code to check whether a piece of software behaves as expected. If it does not, then there is a bug, and bugs are bad. Most of the online materials / books on this are kind of fluffy, describing concepts that are hard to define and unnecessarily convoluted. But really I just understand a test to be: this piece of software is supposed to output 1 when I input 3; is that how it actually behaves?

There are also edge cases to be tested and taken care of accordingly.

So the real issue seems to be how to write good tests. This is a very broad topic with no definite answer, and I’ll omit discussion in this post. For this project, I’m following this guide.

Testing tools in Python 

  1. doctest
  2. unittest
  3. pytest

doctest is packaged with Python. The usage is as follows:

  • Open a Python interpreter and interactively run the function to be tested on a bunch of inputs.
  • Copy and paste all the output from the interactive session into the docstring of that specific function.
  • Add an if __name__ == "__main__" guard in the script under test to detect whether it is imported or executed standalone. If executed standalone, the single line doctest.testmod() runs the tests.

This module parses the docstring to get the expected result for every input. It then compares the actual output with the expected output and reports errors when they do not match. If everything runs fine, the module outputs nothing.
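Here is a minimal, self-contained example of that workflow (add_one is a made-up function):

    def add_one(x):
        """Return x + 1.

        >>> add_one(3)
        4
        >>> add_one(-1)
        0
        """
        return x + 1

    if __name__ == "__main__":
        import doctest
        doctest.testmod()   # prints nothing if all docstring examples pass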

unittest: the biggest difference from doctest is that test cases are no longer defined in the original script. Instead, we create a new script, import the old script and its methods, and put all our test cases there. The good news is that the script is now separated from its tests; the bad news is that we need to write more code.

How to use unittest module?

  • First create a new Python script and import both unittest and the original script in it. This script will be run alone, so specify the shebang line as well.
  • Create a unittest.TestCase subclass. The syntax is class MyTestCase(unittest.TestCase), i.e. MyTestCase is a subclass of unittest.TestCase.
  • So the real question is which methods we can use with unittest.TestCase. These methods are used to assert whether the expected value equals the actual result. Here is a link to the useful parts of its documentation. Basically, the testing process is a series of assertions.
  • The most naive way is just to define a bunch of assertTrue() calls in the class one by one. A more mature way is to use the setUp and tearDown methods to put all test cases into a data structure, e.g. a list, and in the test method body loop through that structure. Sample code here: https://www.python-course.eu/python3_tests.php (a minimal sketch also follows this list).
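A minimal sketch of this setup (func is defined inline here as a stand-in; in practice it would be imported from the original script):

    import unittest

    def func(x):              # stand-in for the imported function under test
        return x + 1

    class MyTestCase(unittest.TestCase):
        def setUp(self):
            # (input, expected) pairs collected in one data structure
            self.cases = [(4, 5), (0, 1), (-1, 0)]

        def test_func(self):
            for arg, expected in self.cases:
                self.assertEqual(func(arg), expected)

    if __name__ == "__main__":
        unittest.main()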

pytest is a third-party package. It is easier to use than the unittest module because it does not require all the class and method setup.

  • In the simplest case, first create a Python script exercise.py with 1) the function to be tested and 2) a function that tests it using assert, e.g. assert func(4) == 5.
  • Then run the script with pytest exercise.py from the command line in the folder containing the script. You do not even need to import pytest in exercise.py!
  • On the other hand, if we do not give pytest an argument, it will run tests on every file of the form *_test.py or test_*.py in the current directory. This is a convention for discovering test files. Also see this link. (A sketch of exercise.py follows this list.)
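For concreteness, a minimal sketch of what exercise.py could look like (func is a made-up example):

    # exercise.py

    def func(x):
        return x + 1

    def test_func():
        assert func(4) == 5   # pytest collects and runs test_* functions automatically

Running pytest exercise.py in that folder should then report one passing test.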

Finally, adding some good resources on pytest that I just found:

  • A webinar : https://www.youtube.com/watch?v=ixqeebhUa-w
  • The accompanying git repo: https://github.com/okken/pycharm_2018_Feb

Some updates on using pytest with PyCharm

  • If you want to run a single test instead of the whole script, see this SO discussion. The pipeline might be broken, though.
  • Pycharm professional has a test-runner tab where you could use auto-test and it will re-run a test function after modification. Link here.
  • Add the verbose option for pytest: in "Edit Configurations", under pytest, specify -v as an additional argument.
  • Put tests in a class so you can run them together. i.e. if they are related and always need to be run together.
  • auto-test: you can have tests re-run continuously as you type.

Happy testing!

Using ROS with Conda

For the past two weeks I’ve been playing around with ROS. All was fine until I tried to embed tensorflow and “deep learning” stuff into ROS with conda.

Things went south immediately. Pipelines broke down; what used to work no longer worked. roslaunch showed a `not a package` error, and Python imports showed package not found even though I had just pip installed the package…

This post aims to clarify what happened, why my system became a mess, and how to resolve the issues.

The first problem, errors when importing modules in Python, arises because ROS conflicts with Conda: ROS sneakily changes the Python path in the `~/.bashrc` file. According to this post from the Udacity Nanodegree GitHub:

The problem here is that Python is looking for a package in /opt/ros/kinetic/lib/python2.7/dist-packages/ when it should be looking in the Anaconda installation directories. The reason for this is that with your ROS install, your .bashrc file contains the line source /opt/ros/kinetic/setup.bash, which sets your PYTHONPATH variable to the /opt/ros/kinetic/... location.

The solution is to unset your PYTHONPATH variable in your ~/.bashrc file, right after sourcing ROS but before sourcing your Anaconda environment. Use the command unset PYTHONPATH.
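To verify the fix, a quick check from inside the conda environment is to print the module search path (a generic check, nothing ROS-specific):

    import sys

    for p in sys.path:
        print(p)   # after the fix, /opt/ros/kinetic/... should no longer shadow conda's paths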

The second issue is that I used many package installers to install Python packages (and possibly Python itself). Be careful of online tutorials, my dude! I ended up with tens of Python paths scattered across different virtual environments, and things became very convoluted.

Package managers I used include:

  • pip (system wide and within conda; I have five pips …)
  • apt (did not figure out what it did until too late)
  • conda

Where do these package managers install packages?

  • pip: https://stackoverflow.com/questions/29980798/where-does-pip-install-its-packages
    • When used without a conda environment, pip will use the base Python's directory.
    • pip when used with virtualenv will generally install packages in the path <virtualenv_name>/lib/<python_ver>/site-packages. For example, I created a test virtualenv named venv_test with Python 2.7, and the django folder is in venv_test/lib/python2.7/site-packages/django.
  • conda installs into the environment's Python site-packages.
    • If the environment has no Python 2 but we want a Python 2 package, conda seems to install it into the system-wide Python site-packages. (The snippet below shows a way to check which interpreter and site-packages are active.)
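A quick way to see which interpreter is actually running and where it keeps its site-packages (worth running in each environment you are debugging):

    import sys
    import sysconfig

    print(sys.executable)                    # the python binary in use
    print(sysconfig.get_paths()["purelib"])  # its site-packages directory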

A detailed comparison of pip and apt:

pip is used to download and install packages directly from PyPI. PyPI is hosted by Python Software Foundation. It is a specialized package manager that only deals with python packages.

apt-get is used to download and install packages from Ubuntu repositories which are hosted by Canonical.

Some of the differences between installing python packages from apt-get and pip are as follows:

  • Canonical only provides packages for selected python modules. Whereas, PyPI hosts a much broader range of python modules. So, there are a lot of python modules which you won’t be able to install using apt-get.
  • Canonical only hosts a single version of any package (generally the latest or the one released in recent past). So, with apt-get we cannot decide the version of python-package that we want. pip helps us in this situation. We can install any version of the package that has previously been uploaded on PyPI. This is extremely helpful in case of conflict in dependencies.
  • apt-get installs python modules in system-wide location. We cannot just install modules in our project virtualenv. pip solves this problem for us. If we are using pip after activating the virtualenv, it is intelligent enough to only install the modules in our project virtualenv. As mentioned in previous point, if there is a version of a particular python package already installed in system-wide location, and one of our project requires an older version of the same python package, in such situations we can use virtualenv and pip to install that older version of python package without any conflicts.
  • As @Radu Rădeanu pointed out in this answer, there would generally be a difference in package names as well. Canonical usually names Python 2 packages python-<package_name> and Python 3 packages python3-<package_name>. Whereas for pip we generally just need to use <package_name> for both Python 2 and Python 3 packages.

Which one should you use:

Both apt-get and pip are mature package managers that automatically install any other package dependencies while installing. You may use either one as you like. However, if you need to install a particular version of a Python package, or install the package in a virtualenv, or install a package which is only hosted on PyPI, only pip will help you solve that issue. Otherwise, if you don't mind installing the packages in a system-wide location, it doesn't really matter whether you use apt-get or pip.

Also, this link that summarizes this problem perfectly: https://stackoverflow.com/questions/14117945/too-many-different-python-versions-on-my-system-and-causing-problem

 

The third issue is that I confused Python packages with ROS packages somewhere in my projects. This is more conceptual, but it did misguide my actions. Here are some clarifications:

ROS

  • ROS is a programming ecosystem/framework that provides middleware for modules (packages) to communicate, especially for robotics systems.
  • ROS provides libraries in both Python and C++.
  • ROS code is organized in packages: folders containing src, data, other libraries, etc.
  • ROS has $ROS_PACKAGE_PATH, an environment variable indicating where ROS looks for packages. For example, `echo $ROS_PACKAGE_PATH` gives the output /opt/ros/kinetic/share. See https://answers.ros.org/question/241528/what-does-environment-set-up-do/

Python

  • Python path: an environment variable that determines where import looks for packages; see https://realpython.com/python-modules-packages/ on the module search path.
  • modules: individual scripts.
  • package: a collection of modules. https://realpython.com/python-modules-packages/#the-dir-function
  • Given this structure, if the pkg directory resides in a location where it can be found (in one of the directories listed in sys.path), you can refer to the two modules with dot notation (pkg.mod1, pkg.mod2) and import them with the syntax you are already familiar with.
  • The __init__.py file in a package can import the modules into the package namespace, so that we can use them directly after importing the package (see the demo below).
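A tiny self-contained demo of these last two points: it writes a package pkg with an __init__.py that imports its module, puts the parent directory on sys.path, and calls the module through the package (all names here are made up):

    import os
    import sys
    import tempfile

    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "pkg"))
    with open(os.path.join(root, "pkg", "__init__.py"), "w") as f:
        f.write("from . import mod1\n")        # pull mod1 into the package namespace
    with open(os.path.join(root, "pkg", "mod1.py"), "w") as f:
        f.write("def foo():\n    return 'hello from pkg.mod1'\n")

    sys.path.insert(0, root)                   # make pkg findable on the module search path
    import pkg
    print(pkg.mod1.foo())                      # works because __init__.py imported mod1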

 

Finally, some useful commands for inspection and cleaning:

  • Find all Python:  sudo find / -type f -executable -iname 'python*' -exec file -i '{}' \; | awk -F: '/x-executable; charset=binary/ {print $1}' | xargs readlink -f | sort -u | xargs -I % sh -c 'echo -n "%: "; % -V'
  • `whereis pip` finds all pips.

Use binary representation of int to represent subsets of n elements

For the past two months I have spent many hours solving algorithmic problems. It's fun, but very time-consuming. I think the majority of the time is spent on trying to find accurate solutions and systematic approaches to problem solving. I also spent a few weeks (!) grasping data structure fundamentals.

Anyway, this post describes a technique I came across when reading this book (page 73): using integers to find all possible subsets of a list of length n. Example Leetcode problem: https://leetcode.com/problems/subsets/description/

The key insights are:

  • Every int is represented in a machine's memory as a binary sequence of 0's and 1's.
  • The ints from 0 to 2**n - 1 together cover all possible combinations of 0's and 1's of length n.
  • Think of each position's value (0 or 1) as an indicator of whether the element at that position appears.
  • So for every int from 0 to 2**n - 1, we only need to extract the positions that are 1 to get the combination of list elements this int represents.
  • With the bitwise operator & and the shift operator <<, we can easily check whether the i-th bit of an int is 1 (turned on) or 0 (turned off). Try x & (1 << i).
  • Looping through all positions from 0 to n-1 gives all the set bits of this int; we append the corresponding elements to the answer list for this int.
  • Finally, we collect every answer list into a big list and return it. (A minimal sketch follows this list.)
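A minimal sketch of the whole procedure, written from the bullet points above (my own version, assuming the input list has no duplicates):

    def subsets(nums):
        n = len(nums)
        result = []
        for mask in range(2 ** n):       # each int in [0, 2**n) encodes one subset
            # keep nums[i] whenever bit i of mask is turned on
            subset = [nums[i] for i in range(n) if mask & (1 << i)]
            result.append(subset)
        return result

    print(subsets([1, 2, 3]))   # 8 subsets, from [] to [1, 2, 3]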

A very smart and interesting strategy. Code snippet here: https://codepad.co/snippet/Tm6Z5ZZP

This strategy can be used for other exhaustive-search problems that require looping through all possible combinations of n elements, with exponential time complexity (2**n). But the very fast bit manipulation usually keeps it within the time limit.