You're right; with X shifts involved, it gets a little messier, but not that much. I think you'll be alright if you're plotting into your "buffer" in rows and you have an "extra" line at the end of the buffer. That way, you *still* have a contiguous block of memory to shunt into the screen memory... well, in one or two chunks anyway.
For the X, it's the same way of updating as the Y - you just plot to the last column(s) that went off the screen. Then you just use the offset to dump the buffer into screen memory and there's your X-Y scroller without any extraneous memcpys (and only a little bit of extra handling working on the buffer wrap-around... which is definitely faster than a large memcpy).
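In case it helps, here's a rough sketch in C of how I picture the offset trick. The names (ring, ring_start, scroll_ring, ring_pixel, blit_ring) are made up for illustration, and the 320x240 layout with 16-bit pixels is an assumption, so adjust to whatever your buffer really looks like.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SCREEN_W = 320, SCREEN_H = 240 };           /* assumed dimensions */
enum { RING_PIXELS = SCREEN_W * (SCREEN_H + 1) };  /* screen plus one extra row */

static uint16_t ring[RING_PIXELS];  /* off-screen buffer, treated as one long ring */
static size_t   ring_start;         /* index of the pixel shown at screen (0, 0) */

/* Scroll by dx pixels and dy rows (either sign): only the offset moves. */
void scroll_ring(int dx, int dy)
{
    long s = (long)ring_start + dx + (long)dy * SCREEN_W;
    s %= RING_PIXELS;
    if (s < 0)
        s += RING_PIXELS;
    ring_start = (size_t)s;
}

/* Buffer location of the pixel currently shown at screen (x, y). */
uint16_t *ring_pixel(int x, int y)
{
    size_t i = ring_start + (size_t)y * SCREEN_W + (size_t)x;
    return &ring[i % RING_PIXELS];
}

/* Dump the visible window into screen memory in one or two chunks. */
void blit_ring(uint16_t *screen)
{
    size_t total = (size_t)SCREEN_W * SCREEN_H;
    size_t first = RING_PIXELS - ring_start;  /* pixels before the buffer end */
    if (first > total)
        first = total;
    memcpy(screen, ring + ring_start, first * sizeof ring[0]);
    memcpy(screen + first, ring, (total - first) * sizeof ring[0]);
}
```

After a scroll_ring(dx, dy) you only replot the newly exposed columns and rows through ring_pixel; everything else stays where it is in the buffer. The extra row is what gives the new right-hand column of the bottom line somewhere to live.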
I'd advise getting some paper and drawing lots of pictures.

... So long as you can start at any point and draw a full 320 pixel row you're alright. Columns are a little trickier and may need a wrap-around adding (like if(column>=240) {column-=240;}, assuming columns are numbered from 0).
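Something along these lines is what I mean by the wrap-around when plotting a column. It's only a sketch: plot_column and its parameters are hypothetical names, and it assumes index starts inside the buffer and pitch is no bigger than the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Write `count` pixels from src into a wrapping buffer, one per row.
   `index` is where the first pixel goes; each step moves down by `pitch`.
   Assumes index < buf_len and pitch <= buf_len. */
void plot_column(uint16_t *buf, size_t buf_len, size_t index,
                 size_t pitch, const uint16_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        buf[index] = src[i];
        index += pitch;
        if (index >= buf_len)   /* ran off the end: wrap back to the start */
            index -= buf_len;
    }
}
```

A single subtraction is enough here because each step advances by at most one buffer length, so there's no need for a modulo in the inner loop.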
For the sake of speed, I'd probably lose a bit of scrolling granularity on the PocketPCs and scroll 32 bits at a time, i.e. 2 pixels. The jump from 1 to 2 pixels probably isn't all that visible, and it'll be much faster plotting 32-bit chunks rather than worrying about aligning things or plotting 16 bits at a time.
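Roughly what I have in mind for the 32-bit plotting, as a sketch only. pack2, snap_to_pair and copy_pairs are names I've invented here, and it assumes 16-bit pixels, a little-endian target, an even row width and a word-aligned buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack two 16-bit pixels into one 32-bit word.
   Assumes little-endian, so the left pixel sits in the low half. */
static inline uint32_t pack2(uint16_t left, uint16_t right)
{
    return (uint32_t)left | ((uint32_t)right << 16);
}

/* Round a scroll offset down to an even pixel so it lands on a word boundary. */
static inline int snap_to_pair(int x)
{
    return x & ~1;
}

/* Copy `pairs` words, i.e. 2 * pairs pixels, one 32-bit store at a time. */
void copy_pairs(uint32_t *dst, const uint32_t *src, size_t pairs)
{
    while (pairs--)
        *dst++ = *src++;
}
```

Keeping the buffer typed as 32-bit words from the start avoids casting a 16-bit pointer to a 32-bit one, which is where the alignment headaches usually come from.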
Anyway, I'm sure you can work it out. I'm off to bed.
Cheers, g'night

Ref.