I'm looking for any optimizations I can make in a graphics editing program
Heyo, this is my first time asking a question here so do forgive me if I mess somethin up >~<
I'm working on a program similar to openCanvas, the earlier ones that allowed multiple people to draw on the same canvas in real time over the internet. OC's really buggy and has a lot of limitations, which is why I wanted to write this.
I have it set up so the canvas extends "indefinitely" in all directions and is made up of 512x512 blocks of pixels that only become active once they're drawn on, which should be really easy to implement. I was thinking about using Direct3D for hardware acceleration, hence the 512-square blocks.
My problem comes when I want to use layers. I'm not quite sure how I can compose layers quickly and without using a ton of memory, since my target is DirectX9-compatible video cards with 128 MB of memory, a CPU around 3.2 GHz, and between 2 and 8 GB of RAM. I had a few different approaches I was thinking of using and was wondering which would probably be the best, and if there was anything I could look into to make it run better.
My first idea was to make the gfx hardware do as much work as possible by having all the layers on all the blocks serve as textures, and they'd be updated by locking the changed area, updating it on the CPU, and unlocking it. Blocks that aren't currently being changed are flattened into one texture and the individual layers themselves are kept in system memory, which would reduce the gfx memory used but could significantly increase bandwidth usage between system and gfx memory. I can see the constant locking and unlocking potentially slowing things down pretty badly as well. Another possible issue is that I've heard of people using up to 200 layers, and I can't think of any good ways to optimize that given the above.
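Just to make it concrete, this is roughly the lock/update/unlock cycle I have in mind - a minimal D3D9 sketch assuming 32-bit ARGB block textures in the managed pool (the function and parameter names are just mine for illustration):

    #include <d3d9.h>
    #include <cstring>

    // Copy a dirty sub-rectangle of CPU-side pixels into a 32-bit ARGB
    // block texture (error handling kept minimal for brevity).
    void UpdateBlockRegion(IDirect3DTexture9* tex, const RECT& dirty,
                           const DWORD* srcPixels, int srcPitchInPixels)
    {
        D3DLOCKED_RECT lr;
        if (FAILED(tex->LockRect(0, &lr, &dirty, 0)))
            return;

        const int w = dirty.right - dirty.left;
        const int h = dirty.bottom - dirty.top;
        BYTE* dst = static_cast<BYTE*>(lr.pBits);

        for (int y = 0; y < h; ++y)             // copy only the changed rows
            std::memcpy(dst + y * lr.Pitch,
                        srcPixels + (size_t)y * srcPitchInPixels,
                        (size_t)w * sizeof(DWORD));

        tex->UnlockRect(0);
    }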
My other idea was to compose the layers -completely- in system memory, write the result into a texture, and copy that texture to gfx memory to be rendered for each block. This seems to eliminate a lot of the issues with the other method, but at the same time I'm moving all the work onto the CPU instead of balancing it. That isn't a big deal as long as it still runs quickly, though. Again, however, there's the issue of having a couple hundred layers. In that case, though, I could probably update only the final pixels that are actually changing, which is what I think the bigger-name programs like SAI and Photoshop do.
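Something like this is what I mean by only updating changed pixels - a rough sketch assuming straight-alpha layers composited over an opaque background; Layer and Color32 are made-up types:

    #include <cstdint>
    #include <vector>

    struct Color32 { uint8_t b, g, r, a; };   // made-up 32-bit BGRA pixel

    struct Layer {
        std::vector<Color32> pixels;           // 512*512, straight alpha
    };

    // Composite `layers` bottom-to-top over an opaque white "paper"
    // background, touching only the pixels inside [x0,x1) x [y0,y1).
    void CompositeDirty(const std::vector<Layer>& layers, Color32* out,
                        int stride, int x0, int y0, int x1, int y1)
    {
        for (int y = y0; y < y1; ++y) {
            for (int x = x0; x < x1; ++x) {
                Color32 acc = {255, 255, 255, 255};    // opaque background
                for (const Layer& l : layers) {
                    const Color32 s = l.pixels[y * 512 + x];
                    const int a = s.a, ia = 255 - a;   // "over" operator
                    acc.r = (uint8_t)((s.r * a + acc.r * ia) / 255);
                    acc.g = (uint8_t)((s.g * a + acc.g * ia) / 255);
                    acc.b = (uint8_t)((s.b * a + acc.b * ia) / 255);
                }
                out[y * stride + x] = acc;             // stays opaque
            }
        }
    }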
I'm mostly looking for recommendations, suggestions that might improve the above, better methods, or links to articles that might be related to such a project. While I'm writing it in C++, I have no trouble translating from other languages. Thanks for your time~
Data Structure
You should definitely use a quadtree (or another hierarchical data structure) to store your canvas and its nodes should contain much smaller blocks than 512x512 pixels. Maybe not as small as 1x1 pixels, because then the hierarchical overhead would kill you - you'll find a good balance through testing.
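A minimal sketch of what such a node could look like (the 32x32 tile size is just an assumption - tune it as described above):

    #include <cstdint>
    #include <memory>

    constexpr int kTileSize = 32;   // pixels per leaf tile edge (assumption)

    struct QuadNode {
        // Children in the order NW, NE, SW, SE; null until that quadrant
        // has actually been drawn on.
        std::unique_ptr<QuadNode> child[4];
        // Pixel data lives only at leaf level (an inner node could instead
        // hold a downsampled tile for LOD - see "Rendering" below).
        std::unique_ptr<uint32_t[]> tile;   // kTileSize * kTileSize, ARGB
    };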
Drawing
Let your users draw only at one (the highest) resolution. Imagine an infinitely large uniform grid (a two-dimensional array). Since you know the mouse position and how far the user has scrolled from the origin, you can derive absolute coordinates. Traverse the quadtree into that region (creating new nodes as needed) and insert the blocks (for example 32x32) into the quadtree as the user draws them. I would buffer what the user draws in a 2D array (for example, as big as their screen resolution) and use a separate thread to traverse/alter the quadtree and copy the data from the buffer, to avoid any delays.
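Building on the QuadNode sketch above, deriving absolute coordinates and descending to the right tile could look roughly like this (all names are illustrative; the sketch assumes the point lies inside the region covered by the root):

    struct Point { long long x, y; };   // absolute canvas coordinates

    Point ToCanvas(int mouseX, int mouseY, Point scrollOrigin)
    {
        return { scrollOrigin.x + mouseX, scrollOrigin.y + mouseY };
    }

    // Descend from a node covering `size` pixels starting at (ox, oy)
    // to the leaf tile containing `p`, creating nodes along the way.
    QuadNode* DescendTo(QuadNode* node, long long size,
                        long long ox, long long oy, Point p)
    {
        while (size > kTileSize) {
            const long long half = size / 2;
            const int ix = p.x >= ox + half;  // 0 = west half, 1 = east half
            const int iy = p.y >= oy + half;  // 0 = north half, 1 = south half
            const int idx = iy * 2 + ix;      // NW, NE, SW, SE
            if (!node->child[idx])
                node->child[idx] = std::make_unique<QuadNode>();
            ox += ix ? half : 0;
            oy += iy ? half : 0;
            size = half;
            node = node->child[idx].get();
        }
        return node;   // leaf whose `tile` receives the drawn pixels
    }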
Rendering
Traverse the quadtree, copy all tiles into one texture, and send that to the GPU? No! You see, sending over one texture that is as big as the screen resolution is not the problem (bandwidth-wise), but traversing the quadtree and assembling the final image is (at least if you want many fps). The answer is to store the quadtree in system memory and stream it to the GPU. That means: asynchronously, another thread does the traversal and copies the currently viewed data to the GPU in chunks, as fast as it can. If the user doesn't view the canvas at full resolution, you don't have to traverse the tree to leaf level, which gives you automatic level of detail (LOD).
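The render-thread side of that streaming could look roughly like this - a bare-bones sketch where the traversal thread fills a shared queue and the render thread uploads at most a fixed number of tiles per frame so it never stalls (synchronization is the bare minimum for illustration):

    #include <cstdint>
    #include <memory>
    #include <mutex>
    #include <queue>

    struct TileUpload {
        long long x, y;                        // tile position on the canvas
        std::unique_ptr<uint32_t[]> pixels;    // tile data from the traversal
    };

    std::mutex g_uploadMutex;
    std::queue<TileUpload> g_uploads;          // filled by the traversal thread

    // Called once per frame on the render thread: upload at most `budget`
    // tiles so that a burst of freshly traversed tiles can't stall a frame.
    template <typename UploadFn>
    void DrainUploads(UploadFn uploadToGpu, int budget)
    {
        for (int i = 0; i < budget; ++i) {
            TileUpload t{};
            {
                std::lock_guard<std::mutex> lock(g_uploadMutex);
                if (g_uploads.empty())
                    return;
                t = std::move(g_uploads.front());
                g_uploads.pop();
            }
            uploadToGpu(t);   // e.g. a LockRect/UnlockRect copy into a texture
        }
    }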
Some random thoughts regarding the proposed strategy
- The quadtree approach is great because it is very memory efficient.
- The streaming idea can be extended to the HDD - have a look at Microsoft's SeaDragon (Deep Zoom) for an example.
- A sophisticated implementation would require something like CUDA.
- If your GPU doesn't offer the necessary performance/programmability, just implement the traversal on the CPU - a little longer delay until the image is fully displayed, but that should be acceptable. Don't forget to program asynchronously, using multiple threads, so that the screen doesn't freeze while waiting on the CPU. You can play with different effects: showing the whole image at once, blurry at first and slowly increasing in detail (breadth-first search, BFS), or rendering it tile by tile (depth-first search, DFS) - maybe mixed with some cool effects.
- A software implementation should be pretty easy when you only allow viewing the canvas at full resolution. If the user can zoom out in steps, that's a minor change to the traversal. If they can zoom seamlessly, that will require linear interpolation between neighboring quadtree nodes' tiles - not trivial anymore, but doable.
- Layers: The quadtree should give you memory consumption low enough to simply store one quadtree per layer. But when you have many layers you'll need some optimizations to stay realtime: you can't assemble 200 textures per frame and send them to the GPU. Maybe (I'm not totally sure whether it's the best solution) you could, for each layer, delete all nodes of the quadtrees beneath it whose tiles' pixels are completely covered by the layer above. That would need to be done at runtime while drawing, and a depth buffer is required. If you offer an eraser tool, you can't delete nodes but have to mark them as "invisible" so that they can be omitted during traversal - see the opacity-test sketch after this list.
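For the layers point, the opacity test could be as simple as this sketch (assuming 32-bit ARGB tiles; "invisible" would be a flag you add to each node):

    #include <cstdint>

    // Returns true if every pixel of a tile is fully opaque - then the
    // matching node in every layer underneath can be skipped (marked
    // invisible rather than deleted, so erasing can reveal it again).
    bool TileFullyOpaque(const uint32_t* tile, int pixelCount)
    {
        for (int i = 0; i < pixelCount; ++i)
            if ((tile[i] >> 24) != 0xFF)   // ARGB: alpha in the top byte
                return false;
        return true;
    }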
..off the top of my head. If you have further questions, let me know!