OpenGL Programming/Object selection

Object selection using the mouse

Introduction

For some applications it is important to be able to select objects on the screen with a click of the mouse. If you have a complex 3D scene with a non-trivial projection (such as a perspective projection), it can be very hard to figure out which object you clicked on based only on the x and y coordinates of the mouse pointer. Fortunately, OpenGL has some features that make this a lot easier. We will look at two valuable techniques. The first is finding the object space coordinates from the mouse coordinates and depth buffer information. The second is uniquely tagging every pixel using the stencil buffer, so that we can figure out which object it belongs to.

Reading back information from the framebuffer

For both techniques, we need to read back information from the framebuffer after we have drawn our 3D scene. We do not need to read back the whole framebuffer; we just want to know what is at the pixel we clicked on. We can register a callback with GLUT to get the mouse position whenever a mouse button is pressed, and use the glReadPixels() function to find out what is at that pixel. To register the callback, use:

glutMouseFunc(onMouse);

The callback looks like this:

void onMouse(int button, int state, int x, int y) {
  /* Only react when a button is pressed, not when it is released */
  if(state != GLUT_DOWN)
    return;

  /* window_width and window_height are assumed to be global variables */
  window_width = glutGet(GLUT_WINDOW_WIDTH);
  window_height = glutGet(GLUT_WINDOW_HEIGHT);

  GLubyte color[4]; /* unsigned, to match GL_UNSIGNED_BYTE below */
  GLfloat depth;
  GLuint index;

  /* GLUT counts y from the top of the window, OpenGL from the bottom */
  glReadPixels(x, window_height - y - 1, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, color);
  glReadPixels(x, window_height - y - 1, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &depth);
  glReadPixels(x, window_height - y - 1, 1, 1, GL_STENCIL_INDEX, GL_UNSIGNED_INT, &index);

  printf("Clicked on pixel %d, %d, color %02hhx%02hhx%02hhx%02hhx, depth %f, stencil index %u\n",
         x, y, color[0], color[1], color[2], color[3], depth, index);
}

The glReadPixels() function is quite straightforward. The first two parameters are the x and y offset in pixels. OpenGL places the origin at the bottom-left corner of the window, whereas GLUT reports mouse coordinates relative to the top-left corner, so we have to flip the y coordinate. The third and fourth parameters are the width and height of the region we are interested in. Since we only want one pixel, we specify a 1 by 1 region. The fifth parameter selects which component of the framebuffer we want to read: GL_RGBA reads the full color information, GL_DEPTH_COMPONENT reads the value of the depth buffer, and GL_STENCIL_INDEX reads the value of the stencil buffer. The sixth parameter is the data type we want the values in. Note that for OpenGL ES 2.0, there may be restrictions on which formats and types you can select for each kind of information, depending on the hardware and driver capabilities of your graphics card. The last parameter is a pointer to the variable or array we want to store the data in.
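The y coordinate flip used in the callback can be isolated in a small helper. This is only a sketch; the function name glutToGlY is hypothetical and not part of GLUT or OpenGL.

```cpp
// Convert a GLUT mouse y coordinate (origin at the top-left corner of the
// window) to an OpenGL window y coordinate (origin at the bottom-left corner).
// glutToGlY is a hypothetical helper name, not a GLUT or OpenGL function.
int glutToGlY(int mouse_y, int window_height) {
  return window_height - mouse_y - 1;
}
```

For a window that is 600 pixels high, the top row (GLUT y = 0) maps to OpenGL row 599, and the bottom row (GLUT y = 599) maps to row 0.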

Of course, you will only get sensible depth or stencil information if you have the depth and stencil buffers enabled. The values you get for the color and stencil index are what you would expect. However, the depth value might be hard to interpret, although you will clearly see that for objects that are nearer to the camera, the depth value is smaller. The depth value will always be between 0 and 1, and by default the background will have a depth value of 1.
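One reason the depth value is hard to interpret is that, with a perspective projection, it is not proportional to the distance from the camera. The sketch below converts a depth buffer value back to a distance, assuming a standard perspective projection (such as one built with glm::perspective()) and the default depth range of 0 to 1. The function and parameter names are hypothetical.

```cpp
#include <cmath>

// Convert a depth buffer value in [0, 1] to a positive distance in front of
// the camera. Assumes a standard perspective projection and the default
// glDepthRange(0, 1). near_plane and far_plane are hypothetical names for the
// near and far clipping distances used to build the projection matrix.
float linearizeDepth(float depth, float near_plane, float far_plane) {
  float z_ndc = 2.0f * depth - 1.0f;  // map [0, 1] to [-1, 1]
  return 2.0f * near_plane * far_plane /
         (far_plane + near_plane - z_ndc * (far_plane - near_plane));
}
```

With these assumptions, a depth of 0 maps to the near plane distance and a depth of 1 maps to the far plane distance, while a depth of 0.5 is still very close to the near plane.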

Exercises

  • Modify the textured cube tutorial to read the color, depth and stencil buffer information when you click in the window.

Unprojecting window coordinates

The mouse pointer x and y coordinates and the depth buffer z value are in so-called window coordinates, which are of little use on their own. What we want is to convert them back to object space coordinates, which is the coordinate system we used to specify our vertex coordinates in. To do this, we need to apply the inverse of the transformation matrix to the window coordinates. The GLM library conveniently has a function that does exactly what we want: glm::unProject(). This is how it is used, given the view and projection matrices you used to display the scene:

  glm::vec4 viewport = glm::vec4(0, 0, window_width, window_height);
  glm::vec3 wincoord = glm::vec3(x, window_height - y - 1, depth);
  glm::vec3 objcoord = glm::unProject(wincoord, view, projection, viewport);

  printf("Coordinates in object space: %f, %f, %f\n",
         objcoord.x, objcoord.y, objcoord.z);
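To illustrate what happens inside such an unprojection, here is a minimal sketch for one special case only: a symmetric perspective projection, an identity view matrix, and a viewport that starts at (0, 0). The struct, function, and parameter names are hypothetical. glm::unProject() itself handles arbitrary matrices by inverting them, which this sketch does not do.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Sketch of the unprojection steps for a symmetric perspective projection
// with an identity view matrix and a viewport starting at (0, 0).
// All names here are hypothetical and chosen for illustration.
Vec3 unprojectPerspective(float win_x, float win_y, float depth,
                          float window_width, float window_height,
                          float fovy_radians, float near_plane, float far_plane) {
  // Step 1: window coordinates to normalized device coordinates in [-1, 1].
  float ndc_x = 2.0f * win_x / window_width - 1.0f;
  float ndc_y = 2.0f * win_y / window_height - 1.0f;
  float ndc_z = 2.0f * depth - 1.0f;

  // Step 2: recover the distance in front of the camera from the depth value.
  float distance = 2.0f * near_plane * far_plane /
                   (far_plane + near_plane - ndc_z * (far_plane - near_plane));

  // Step 3: undo the perspective divide and the field-of-view scaling.
  float half_height = std::tan(fovy_radians / 2.0f);
  float aspect = window_width / window_height;
  return Vec3{ndc_x * distance * half_height * aspect,
              ndc_y * distance * half_height,
              -distance};  // the camera looks down the negative z axis
}
```

In this special case, the center of the window at depth 0 maps to a point on the near plane straight in front of the camera.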

Note that if the depth value is 1, the coordinates will not really make sense, so you should check for that. Now that we know the object space coordinates, we can try to find out which object is closest to those coordinates. An easy technique is to loop through all the objects, and check the distance of the center of each object to those coordinates. This might not give exact matches though, especially if the objects have complex shapes. If the objects are on a regular grid, you can easily convert from object coordinates to grid coordinates. If you use an octree or BSP to store your geometry, you can traverse those data structures to quickly find where you clicked. However, if you want to know exactly which object is at the pixel you clicked, and the above methods are not good enough, then you can try the stencil buffer technique described in the next section.
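The loop over all object centers could look like the following sketch. It assumes you keep a list of object centers in the same coordinate space as the unprojected point; the Point3 struct and the findNearestObject function are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Point3 { float x, y, z; };

// Return the index of the object whose center is closest to `point`,
// or -1 if there are no objects. `centers` is a hypothetical list of
// object centers in the same coordinate space as `point`.
int findNearestObject(const std::vector<Point3>& centers, const Point3& point) {
  int best = -1;
  float best_dist2 = 0.0f;
  for (std::size_t i = 0; i < centers.size(); i++) {
    float dx = centers[i].x - point.x;
    float dy = centers[i].y - point.y;
    float dz = centers[i].z - point.z;
    // Comparing squared distances avoids an unnecessary square root.
    float dist2 = dx * dx + dy * dy + dz * dz;
    if (best < 0 || dist2 < best_dist2) {
      best = static_cast<int>(i);
      best_dist2 = dist2;
    }
  }
  return best;
}
```

Remember to skip this search when the depth value you read back is 1, since the click then landed on the background.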

Exercises

  • You usually apply a model-view-projection matrix to the objects you draw. Why should you not include the model matrix in your unprojection?
  • Suppose you wrote a 2D game with an isometric projection, and you can draw your screen from bottom to top without using the depth buffer. Can you still use glm::unProject() with just the x and y coordinates?
  • Change the textured cube tutorial to render at least 10 smaller cubes. When clicking on the window, find out which cube's center is closest to the point you clicked.

Using the stencil buffer to identify objects

If you're not using the stencil buffer for anything else, you can use it to hold information about the objects on screen. Similar to how we would draw a color in the color buffer, we can draw a number in the stencil buffer. First, make sure you add GLUT_STENCIL to your call to glutInitDisplayMode(). Then, assuming we are drawing ten objects, we can draw each object's number to the stencil buffer this way:

void onDisplay() {
  glClearColor(...);
  glClearStencil(0); // this is the default value
  glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT|GL_STENCIL_BUFFER_BIT);

  /* Any other initialization goes here */
  ...

  /* Enable stencil operations */
  glEnable(GL_STENCIL_TEST);
  glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);

  for(int i = 0; i < 10; i++) {
    glStencilFunc(GL_ALWAYS, i + 1, -1);
    draw_object(i);
  }
}

First, we clear the whole framebuffer, and ensure the stencil buffer contains only zeros. Next, we need to enable the stencil test, otherwise nothing will happen. We use glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE) to ensure the stencil buffer will be written to whenever the color buffer is written to (in particular, only when the depth test succeeds), and we will replace the existing value with a fixed new value. Then, right before drawing the object itself, we set the stencil function to always pass the stencil test, and the reference value to i + 1 (because 0 is already used for the background). We set the mask to have all bits enabled (you can also use ~0 or 0xff instead of -1 if you prefer). When reading back from the stencil buffer, you know you clicked on the background when you read a zero, otherwise you clicked on an object. Don't forget to subtract 1 when necessary.
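Mapping the stencil value back to an object number can be wrapped in a small helper. This is a sketch that assumes the objects were drawn with reference values i + 1 as described above; the function name and the object_count parameter are hypothetical.

```cpp
// Map a stencil value read back with glReadPixels() to an object index,
// assuming object i was drawn with stencil reference value i + 1.
// Returns -1 for the background (stencil value 0) and for values that do not
// correspond to any object. object_count is a hypothetical parameter; it
// would be 10 for the ten objects drawn in the example.
int objectFromStencil(unsigned int stencil_index, int object_count) {
  if (stencil_index == 0 ||
      stencil_index > static_cast<unsigned int>(object_count))
    return -1;
  return static_cast<int>(stencil_index) - 1;
}
```

In the mouse callback, you would pass the index read with GL_STENCIL_INDEX to this helper and ignore the click when it returns -1.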

Exercises

  • Modify the textured cube example to write to the stencil buffer. Make it so you can highlight cubes when you click on them.

More techniques

The two techniques above are simple and fast, but might not be good enough for you. Although the stencil buffer technique is the more accurate of the two, almost all graphics cards only support an 8-bit stencil buffer. That means you can uniquely identify at most 255 objects, since the value 0 is reserved for the background. If you need more than 255 objects or are already using the stencil buffer for something else, consider using one of these alternatives:

  • Combining information from the stencil buffer, the color buffer, and the object coordinates might give you a unique solution.
  • Draw the scene multiple times, using glScissor() to render only the pixel you are interested in. In every pass, you can get 8 bits of information from the stencil buffer, so 3 passes allow you to uniquely identify about 16 million objects.
  • Draw the scene a second time, but instead of using the stencil buffer, give each object a unique solid color. This will give you a 24-bit or even 32-bit number for every pixel. Make sure you disable anti-aliasing though.
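For the unique solid color approach, the object number has to be packed into the color channels and unpacked again after reading the pixel back. The following is a minimal sketch of one possible 24-bit encoding, assuming an 8-bit-per-channel color buffer. The PickColor struct and both function names are hypothetical.

```cpp
#include <cstdint>

struct PickColor { std::uint8_t r, g, b; };

// Encode an object index as a unique solid color. The value 0 (black) is
// reserved for the background, so object i is stored as i + 1, mirroring the
// stencil buffer example. Only the lowest 24 bits of the ID are kept.
PickColor encodePickColor(std::uint32_t object_index) {
  std::uint32_t id = object_index + 1;
  return PickColor{static_cast<std::uint8_t>(id & 0xff),
                   static_cast<std::uint8_t>((id >> 8) & 0xff),
                   static_cast<std::uint8_t>((id >> 16) & 0xff)};
}

// Decode the RGB bytes read back from the color buffer.
// Returns -1 if the pixel belongs to the background.
int decodePickColor(PickColor c) {
  std::uint32_t id = static_cast<std::uint32_t>(c.r) |
                     (static_cast<std::uint32_t>(c.g) << 8) |
                     (static_cast<std::uint32_t>(c.b) << 16);
  return static_cast<int>(id) - 1;
}
```

The decoded value only matches the encoded one if the color reaches the framebuffer unmodified, which is why anti-aliasing has to be disabled for this pass.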
