1. My answer:
Your method is fine.
a. Grab one or more buffers the same size as the screen.
b. Plot into the buffer(s).
c. Blit/memcpy a buffer to the screen.
d. goto b
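A rough sketch of steps a to c in C, in case it helps. The 240x160 size, the 16bpp pixel_t and the names plot() and present() are all my own assumptions for illustration; swap in whatever your device actually uses, and pass present() whatever pointer the platform hands you for the screen.

```c
#include <stdint.h>
#include <string.h>

#define SCREEN_W 240   /* assumed screen width  */
#define SCREEN_H 160   /* assumed screen height */

typedef uint16_t pixel_t;  /* assumed 16bpp destination format */

/* a. A back buffer the same size and format as the screen. */
static pixel_t back_buffer[SCREEN_W * SCREEN_H];

/* b. Plot one pixel into the back buffer, ignoring off-screen coordinates. */
static void plot(int x, int y, pixel_t colour)
{
    if (x < 0 || x >= SCREEN_W || y < 0 || y >= SCREEN_H)
        return;
    back_buffer[y * SCREEN_W + x] = colour;
}

/* c. Copy the finished frame to the screen in one go. */
static void present(pixel_t *screen)
{
    memcpy(screen, back_buffer, sizeof back_buffer);
}
```

Step d is then just a loop: plot the frame, call present(), repeat.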
2. My answers:
a) Yes it should be.
b) Yes it could be... but..
Personally, I'd make sure the back-buffer was the same as the destination and have a custom plot routine for each type of output you anticipate (8bpp palettised, 16bpp, etc).
You should know what you're plotting, and how you're going to plot it, so you should be able to pre-process all of your data to fit the destination *once* (bitmaps, etc). Mangle the data once, don't try and do it every frame on the copy, 'cos that's going to be *slow*.
That's what I'd do, anyway, but then I'm a little twisted when it comes to low-level stuff (I like getting my hands dirty with assembler).
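To illustrate the "mangle the data once" idea, here is a minimal sketch that converts a 24bpp bitmap to 16bpp at load time, so the per-frame copy is a straight memcpy. The RGB565 layout and the function names are assumptions on my part; check what bit layout your target really uses (the GBA, for instance, uses a 15-bit format).

```c
#include <stddef.h>
#include <stdint.h>

/* Pack 8-bit R, G, B into one 16bpp value (RGB565 layout assumed). */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Convert a whole bitmap once, at load time.
   src holds pixel_count pixels as R, G, B byte triples. */
static void convert_rgb888_to_rgb565(const uint8_t *src, uint16_t *dst,
                                     size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; i++)
        dst[i] = pack_rgb565(src[3 * i], src[3 * i + 1], src[3 * i + 2]);
}
```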
Another thing to watch is the orientation of the screen buffer - that's come up here before - I think it was Digby (him again) who thought it was best to pre-rotate the data/bitmaps, and I agree with him. Rotating a buffer onto the screen each frame is *nuts* (IMHO).
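For the pre-rotation, something like this run once per bitmap at load time (or offline in a tool) would do. It is only a sketch: the 90-degree clockwise direction, the 16bpp pixels and the name rotate90_cw() are my assumptions, so pick whichever direction matches how your screen is mounted.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a w x h bitmap 90 degrees clockwise into dst.
   dst must hold w * h pixels; the result is h pixels wide and w tall. */
static void rotate90_cw(const uint16_t *src, uint16_t *dst, size_t w, size_t h)
{
    for (size_t y = 0; y < h; y++)
        for (size_t x = 0; x < w; x++)
            dst[x * h + (h - 1 - y)] = src[y * w + x];
}
```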
3. My answer: Pass.
I don't know that one, sorry. It'll probably depend on how the hardware sets out its palette (the GBA, for example, has two 256-colour palettes for 8-bit mode - one for sprites, one for the tiles).
As before, if you can, pre-process your data into a palette, don't munge every pixel at runtime.
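One way to do that pre-processing, sketched under the assumption that you already have a fixed palette and just want each pixel mapped to its closest entry. The rgb_t type, the function names and the plain squared-distance match are my own choices for illustration, not anything a particular device requires; the point is that this runs once per bitmap, not once per frame.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb_t;

/* Return the index of the palette entry closest to c
   (squared distance in RGB space). */
static size_t nearest_palette_index(rgb_t c, const rgb_t *palette,
                                    size_t palette_size)
{
    size_t best = 0;
    long best_dist = -1;
    for (size_t i = 0; i < palette_size; i++) {
        long dr = (long)c.r - palette[i].r;
        long dg = (long)c.g - palette[i].g;
        long db = (long)c.b - palette[i].b;
        long dist = dr * dr + dg * dg + db * db;
        if (best_dist < 0 || dist < best_dist) {
            best = i;
            best_dist = dist;
        }
    }
    return best;
}

/* Remap a whole bitmap to 8-bit palette indices, once, at load time.
   palette_size is assumed to be at most 256. */
static void remap_to_palette(const rgb_t *src, uint8_t *dst, size_t pixel_count,
                             const rgb_t *palette, size_t palette_size)
{
    for (size_t i = 0; i < pixel_count; i++)
        dst[i] = (uint8_t)nearest_palette_index(src[i], palette, palette_size);
}
```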
4. My answer: Define "best"
Fastest to code...
... is to generate raw data the same as your output.
Size is going to suck though, without compression. There are compression routines at http://www.devrs.com/gba/
Fastest to load...
... who cares - you're probably loading from RAM; it's going to be zippy. I'd stick the graphics in an external file rather than building it in (same for "levels").
Size...
bitmaps and raw data are costly size-wise. Compression is your friend. RLE bitmaps may help a bit, but it depends on your source data.
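To give a feel for what RLE involves, here is a bare-bones byte-wise version. The (count, value) pair format and the names rle_encode() and rle_decode() are assumptions for the sake of the example, not any standard file format. Note the worst case: data with no repeats comes out twice the size, which is why it depends on your source data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode src as (count, value) byte pairs, with count in 1..255.
   dst must have room for 2 * n bytes. Returns the encoded length. */
static size_t rle_encode(const uint8_t *src, size_t n, uint8_t *dst)
{
    size_t out = 0;
    size_t i = 0;
    while (i < n) {
        uint8_t value = src[i];
        size_t run = 1;
        while (i + run < n && src[i + run] == value && run < 255)
            run++;
        dst[out++] = (uint8_t)run;
        dst[out++] = value;
        i += run;
    }
    return out;
}

/* Decode (count, value) pairs back into raw bytes.
   dst must be large enough for the original data. Returns the decoded length. */
static size_t rle_decode(const uint8_t *src, size_t n, uint8_t *dst)
{
    size_t out = 0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        memset(dst + out, src[i + 1], src[i]);
        out += src[i];
    }
    return out;
}
```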
If you're doing one EXE to fit all pocket devices, then it's probably "best" to store the bitmaps (compressed) at 24bpp, then you'll be ok when devices with 24-bit displays come out.
The main concern for me with pocket devices is to keep the size down. If somebody writes a game but the memory footprint is huge, then I'm not interested. I think that's a general attitude, but I may (as always) be wrong.
If I were you, I'd target the device you own, get everything working with raw data, then add compression, etc to the mix.
Hope that helps,
Refractor.