To move the camera along the x-axis of its own local coordinate system, compute the camera's "right" vector as the cross product of the camera's "direction" vector (the "look at" point minus the camera position, using the same values you pass to gluLookAt) and the camera's "up" vector.
So: right = direction x up
Then offset both the camera's position and the "look at" position that you pass to gluLookAt by the same scaled "right" vector.
Note: Be sure to normalize the computed "right" vector before scaling it. Normalizing "direction" and "up" first is not enough on its own, because the cross product of two unit vectors only has unit length when they are perpendicular.
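As a minimal sketch of the sideways move described above: the `Vec3` type, its operators, and the `strafe` helper are hypothetical names made up for this illustration, not part of OpenGL or GLU. The updated `eye` and `target` would then be passed to gluLookAt as usual.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal 3D vector type for illustration.
struct Vec3 { double x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 v, double s) { return {v.x * s, v.y * s, v.z * s}; }

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0);  // a zero-length vector has no direction
    return v * (1.0 / len);
}

// Slide the camera sideways: eye and target move by the same offset,
// so the viewing direction itself does not change.
void strafe(Vec3& eye, Vec3& target, Vec3 up, double amount) {
    Vec3 direction = normalize(target - eye);
    Vec3 right = normalize(cross(direction, up));  // right = direction x up
    Vec3 offset = right * amount;
    eye = eye + offset;
    target = target + offset;
}
```

For a camera at (0,0,5) looking at the origin with up (0,1,0), "right" comes out as the world +x axis, so a positive `amount` moves the camera toward +x.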
To move along the local y-axis, offset the camera's position and the "look at" position by a scaled "up" vector. This is NOT the "up" vector you pass to gluLookAt; it is the cross product of the previously computed "right" vector and the camera's "direction" vector.
So: up = right x direction
Computing the "up" vector this way ensures that, as you tilt the camera up and down (that is, apply "pitch"), the "up" direction follows along instead of always pointing along world (0,1,0). Moving up and down then works as expected.
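Here is a small sketch of the vertical move with a pitched camera. As before, `Vec3`, `cameraUp`, and `elevate` are hypothetical names chosen for this example only; `worldUp` stands for the "up" vector you would hand to gluLookAt.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal 3D vector type for illustration.
struct Vec3 { double x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 v, double s) { return {v.x * s, v.y * s, v.z * s}; }

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0);  // a zero-length vector has no direction
    return v * (1.0 / len);
}

// The camera's own "up": perpendicular to both "right" and "direction".
Vec3 cameraUp(Vec3 eye, Vec3 target, Vec3 worldUp) {
    Vec3 direction = normalize(target - eye);
    Vec3 right = normalize(cross(direction, worldUp));
    return cross(right, direction);  // up = right x direction (already unit length)
}

// Move eye and target along the camera's own "up" by the same offset.
void elevate(Vec3& eye, Vec3& target, Vec3 worldUp, double amount) {
    Vec3 offset = cameraUp(eye, target, worldUp) * amount;
    eye = eye + offset;
    target = target + offset;
}
```

With the camera at the origin looking at (0,1,-1), that is, pitched up by 45 degrees, `cameraUp` leans back toward +z rather than staying at world (0,1,0), which is the behaviour the paragraph above describes.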
One noteworthy thing: this computation also gives you an orthonormal basis (right, up, direction) for rotating any vector from the local camera coordinate system to world coordinates, provided the vectors are normalized. Because an OpenGL camera looks down its negative z-axis, the third axis of the right-handed camera frame is -direction. This transformation is essentially what gluLookAt computes internally, except that gluLookAt also adds the translation component (the camera position) and produces the inverse of the transformation, since OpenGL wants to transform from world to camera coordinates.
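To make the relation to gluLookAt concrete, here is a sketch that builds a world-to-camera matrix from the same basis. It follows the layout given in the gluLookAt documentation (rows right, up, -direction, combined with a translation by the negated camera position), but `Vec3`, `Mat4`, `lookAtMatrix`, and `transformPoint` are hypothetical names for this example, and the code is an illustration rather than GLU's actual implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical minimal 3D vector type for illustration.
struct Vec3 { double x, y, z; };

Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 v, double s) { return {v.x * s, v.y * s, v.z * s}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    double len = std::sqrt(dot(v, v));
    assert(len > 0.0);  // a zero-length vector has no direction
    return v * (1.0 / len);
}

// 4x4 matrix stored column-major, the layout OpenGL expects.
using Mat4 = std::array<double, 16>;

// World-to-camera matrix: the inverse of "rotate by the camera basis,
// then translate to the camera position".
Mat4 lookAtMatrix(Vec3 eye, Vec3 target, Vec3 worldUp) {
    Vec3 f = normalize(target - eye);        // direction
    Vec3 r = normalize(cross(f, worldUp));   // right
    Vec3 u = cross(r, f);                    // camera up
    return {
        r.x, u.x, -f.x, 0.0,                          // column 0
        r.y, u.y, -f.y, 0.0,                          // column 1
        r.z, u.z, -f.z, 0.0,                          // column 2
        -dot(r, eye), -dot(u, eye), dot(f, eye), 1.0  // column 3
    };
}

// Apply the matrix to a point (w = 1).
Vec3 transformPoint(const Mat4& m, Vec3 p) {
    return {m[0] * p.x + m[4] * p.y + m[8] * p.z + m[12],
            m[1] * p.x + m[5] * p.y + m[9] * p.z + m[13],
            m[2] * p.x + m[6] * p.y + m[10] * p.z + m[14]};
}
```

The rotation part is just the basis vectors written as rows, because the inverse of an orthonormal rotation is its transpose. For a camera at (0,0,5) looking at the origin, the world origin ends up 5 units in front of the camera, on the negative z-axis of camera space.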
Hope that helps!