How are you timing the Present call?
Keep in mind that D3DM buffers your rendering commands until it absolutely must send them to the driver. For most games, that happens when the app calls Present. So if you simply record the elapsed time of Present, you are also measuring the time the driver takes to parse and execute everything that has accumulated in the command buffer since the last flush.
If you want to measure the speed of the Present call alone, you can force a flush of the command buffer by locking and then unlocking the back buffer just before you start timing Present.
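To make the effect concrete, here is a rough toy model in C++. It is not D3DM code: FakeDevice, Draw, Flush, TimePresent and MeasureFrame are all made-up names, and costs are counted in abstract "ticks" rather than real time. Flush stands in for the lock/unlock of the back buffer, on the assumption that it drains the queue the way described above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed cost of the present itself, in abstract ticks (illustrative only).
constexpr std::uint32_t kPresentCost = 2;

// Toy model of a runtime that defers rendering commands.
// None of these names are real D3DM APIs.
class FakeDevice {
public:
    // Queue a command; nothing is executed yet.
    void Draw(std::uint32_t costTicks) { pending_.push_back(costTicks); }

    // Execute everything queued so far. Stands in for locking and
    // unlocking the back buffer to force a flush.
    void Flush() {
        for (std::uint32_t cost : pending_) clock_ += cost;
        pending_.clear();
    }

    // Present first drains whatever is still queued, then pays its own cost.
    void Present() {
        Flush();
        clock_ += kPresentCost;
    }

    std::uint64_t Now() const { return clock_; }

private:
    std::vector<std::uint32_t> pending_;
    std::uint64_t clock_ = 0;
};

// Ticks that elapse across Present(), optionally flushing before the timer starts.
std::uint64_t TimePresent(FakeDevice& dev, bool flushFirst) {
    if (flushFirst) dev.Flush();
    const std::uint64_t start = dev.Now();
    dev.Present();
    return dev.Now() - start;
}

// One frame with two queued draws costing 40 and 60 ticks.
std::uint64_t MeasureFrame(bool flushFirst) {
    FakeDevice dev;
    dev.Draw(40);
    dev.Draw(60);
    return TimePresent(dev, flushFirst);
}
```

In this model, MeasureFrame(false) charges the queued draws to Present, while MeasureFrame(true) reports only the present cost. The same pattern applies to a real timer: flush first, then start the clock.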
The Present operation from the back buffer to the front buffer should be performed in hardware on the 2700G, so that shouldn't be a bottleneck in your app.
For most games, optimum performance will occur when running fullscreen, with the swap effect set to discard and the presentation interval set to immediate (although presenting immediately can cause tearing).
If you're running in a window, make sure that your back buffer is the same size as the window's client rect and that its pixel format is the same as the primary surface's. You don't want the driver to do any sort of conversion (stretch/shrink/color) during each Present.
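As a sanity check for the windowed case, something along these lines could be run at startup and after a resize. This is only a sketch with hypothetical types (PixelFormat, Size, SurfaceDesc, PresentWork, ClassifyPresent); in a real app you would fill them in from whatever your back buffer, window and primary surface actually report.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real surface and window descriptions.
enum class PixelFormat { RGB565, XRGB8888 };

struct Size {
    int width;
    int height;
};

struct SurfaceDesc {
    Size size;
    PixelFormat format;
};

// Extra work the driver would have to do on every Present.
enum class PresentWork { DirectCopy, Stretch, ColorConvert, StretchAndColorConvert };

// Compare the back buffer against the window's client rect and the
// primary surface format, and report which conversions a Present would need.
PresentWork ClassifyPresent(const SurfaceDesc& backBuffer,
                            const Size& clientRect,
                            PixelFormat primaryFormat) {
    const bool sizeDiffers = backBuffer.size.width != clientRect.width ||
                             backBuffer.size.height != clientRect.height;
    const bool formatDiffers = backBuffer.format != primaryFormat;

    if (sizeDiffers && formatDiffers) return PresentWork::StretchAndColorConvert;
    if (sizeDiffers) return PresentWork::Stretch;
    if (formatDiffers) return PresentWork::ColorConvert;
    return PresentWork::DirectCopy;
}
```

Anything other than DirectCopy suggests the back buffer should be recreated to match the window before worrying about the Present time itself.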