api_resources.rs |
|
12725 |
batch.rs |
|
175887 |
border.rs |
|
50375 |
box_shadow.rs |
|
21498 |
bump_allocator.rs |
|
15978 |
capture.rs |
|
8789 |
clip.rs |
|
86885 |
command_buffer.rs |
|
17779 |
composite.rs |
Types and definitions related to compositing picture cache tiles
and/or OS compositor integration.
|
61425 |
compositor |
|
|
debug_colors.rs |
|
14804 |
debug_font_data.rs |
|
117993 |
debug_item.rs |
|
709 |
device |
|
|
ellipse.rs |
|
6061 |
filterdata.rs |
|
7719 |
frame_allocator.rs |
|
15716 |
frame_builder.rs |
|
49668 |
freelist.rs |
A generic backing store for caches.
`FreeList` is a simple vector-backed data structure where each entry in the
vector contains an Option<T>. It maintains an index-based (rather than
pointer-based) free list to efficiently locate the next unused entry. If all
entries are occupied, insertion appends a new element to the vector.
It also supports both strong and weak handle semantics. There is exactly one
(non-Clonable) strong handle per occupied entry, which must be passed by
value into `free()` to release an entry. Strong handles can produce an
unlimited number of (Clonable) weak handles, which are used to perform
lookups which may fail if the entry has been freed. A per-entry epoch ensures
that weak handle lookups properly fail even if the entry has been freed and
reused.
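
The handle scheme described above can be sketched as follows. This is a minimal illustration, not the code in `freelist.rs`: the names `StrongHandle`, `WeakHandle`, `Slot`, `insert`, `get_opt` and `weak` are assumptions chosen for this example, and the real module's types and signatures may differ.

```rust
/// Strong handle: deliberately not `Clone`; it must be moved into `free()`.
pub struct StrongHandle {
    index: usize,
    epoch: u32,
}

/// Weak handle: cheap to copy; lookups through it may fail.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct WeakHandle {
    index: usize,
    epoch: u32,
}

impl StrongHandle {
    /// Produce any number of weak handles from one strong handle.
    pub fn weak(&self) -> WeakHandle {
        WeakHandle { index: self.index, epoch: self.epoch }
    }
}

struct Slot<T> {
    value: Option<T>,
    epoch: u32,
    /// Index of the next unused slot, only meaningful while this slot is free.
    next_free: Option<usize>,
}

pub struct FreeList<T> {
    slots: Vec<Slot<T>>,
    free_head: Option<usize>,
}

impl<T> FreeList<T> {
    pub fn new() -> Self {
        FreeList { slots: Vec::new(), free_head: None }
    }

    /// Reuse the first free slot if there is one, otherwise append.
    pub fn insert(&mut self, value: T) -> StrongHandle {
        match self.free_head {
            Some(index) => {
                let slot = &mut self.slots[index];
                self.free_head = slot.next_free.take();
                slot.value = Some(value);
                StrongHandle { index, epoch: slot.epoch }
            }
            None => {
                let index = self.slots.len();
                self.slots.push(Slot { value: Some(value), epoch: 0, next_free: None });
                StrongHandle { index, epoch: 0 }
            }
        }
    }

    /// Weak lookup: fails if the slot was freed, even if it has since been reused.
    pub fn get_opt(&self, handle: &WeakHandle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.epoch == handle.epoch { slot.value.as_ref() } else { None }
    }

    /// Release an entry. Bumping the epoch invalidates all outstanding weak handles.
    pub fn free(&mut self, handle: StrongHandle) -> T {
        let slot = &mut self.slots[handle.index];
        slot.epoch = slot.epoch.wrapping_add(1);
        slot.next_free = self.free_head;
        self.free_head = Some(handle.index);
        slot.value.take().expect("a strong handle always refers to an occupied slot")
    }
}
```

Because `free()` takes the strong handle by value and the handle is not clonable, an entry cannot be released twice; the epoch comparison in `get_opt` is what makes a stale weak handle fail after its slot has been reused.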
TODO(gw): Add an occupied list head, for fast iteration of the occupied list
to implement retain() style functionality. |
7682 |
glyph_cache.rs |
|
6759 |
gpu_cache.rs |
Overview of the GPU cache.
The main goal of the GPU cache is to allow on-demand
allocation and construction of GPU resources for the
vertex shaders to consume.
Every item that wants to be stored in the GPU cache
should create a GpuCacheHandle that is used to refer
to a cached GPU resource. Creating a handle is a
cheap operation, that does *not* allocate room in the
cache.
On any frame when that data is required, the caller
must request that handle, via ```request```. If the
data is not in the cache, the user provided closure
will be invoked to build the data.
After ```end_frame``` has occurred, callers can
use the ```get_address``` API to get the allocated
address in the GPU cache of a given resource slot
for this frame. |
33231 |
gpu_types.rs |
|
30107 |
hit_test.rs |
|
13546 |
image_source.rs |
This module contains the logic to obtain a primitive's source texture and uv rect.
Currently this is a somewhat involved process because the code grew into having ad-hoc
ways to store this information depending on how the image data is produced. The goal
is for any textured primitive to be able to read from any source (texture cache, render
tasks, etc.) without primitive-specific code. |
4199 |
image_tiling.rs |
|
28139 |
intern.rs |
The interning module provides a generic data structure
interning container. It is similar in concept to a
traditional string interning container, but it is
specialized to the WR thread model.
There is an Interner structure, that lives in the
scene builder thread, and a DataStore structure
that lives in the frame builder thread.
Hashing, interning and handle creation are done by
the interner structure during scene building.
Delta changes for the interner are pushed during
a transaction to the frame builder. The frame builder
is then able to access the content of the interned
handles quickly, via array indexing.
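
The split between the two structures might look roughly like the sketch below. It is an illustration under stated assumptions, not the module's actual API: `Handle`, `intern`, `take_updates`, `apply_updates` and `get` are hypothetical names, both halves run on one thread here, and the epoch tracking and garbage collection covered in the rest of this description are left out entirely.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical handle type: just an index into the data store.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Handle {
    index: usize,
}

/// Lives on the scene builder side: hashes items and hands out handles.
pub struct Interner<T> {
    map: HashMap<T, Handle>,
    /// Newly interned items, waiting to be sent to the frame builder.
    pending: Vec<(usize, T)>,
}

impl<T: Clone + Eq + Hash> Interner<T> {
    pub fn new() -> Self {
        Interner { map: HashMap::new(), pending: Vec::new() }
    }

    /// Return the existing handle for an equal item, or create a new one.
    pub fn intern(&mut self, item: &T) -> Handle {
        if let Some(handle) = self.map.get(item) {
            return *handle;
        }
        // Nothing is ever removed in this sketch, so the map length is a fresh index.
        let handle = Handle { index: self.map.len() };
        self.map.insert(item.clone(), handle);
        self.pending.push((handle.index, item.clone()));
        handle
    }

    /// Drain the delta that would be pushed along with a transaction.
    pub fn take_updates(&mut self) -> Vec<(usize, T)> {
        std::mem::take(&mut self.pending)
    }
}

/// Lives on the frame builder side: resolves handles by array indexing.
pub struct DataStore<T> {
    items: Vec<Option<T>>,
}

impl<T> DataStore<T> {
    pub fn new() -> Self {
        DataStore { items: Vec::new() }
    }

    pub fn apply_updates(&mut self, updates: Vec<(usize, T)>) {
        for (index, item) in updates {
            if index >= self.items.len() {
                self.items.resize_with(index + 1, || None);
            }
            self.items[index] = Some(item);
        }
    }

    pub fn get(&self, handle: Handle) -> Option<&T> {
        self.items.get(handle.index)?.as_ref()
    }
}
```

In this sketch a handle only resolves on the data store side once the corresponding updates have been applied, which mirrors the idea of pushing deltas during a transaction.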
Epoch tracking ensures that the garbage collection
step which the interner uses to remove items is
only invoked on items that the frame builder thread
is no longer referencing.
Items in the data store are stored in a traditional
free-list structure, for content access and memory
usage efficiency.
The epoch is incremented each time a scene is
built. The most recently used scene epoch is
stored inside each handle. This is then used for
cache invalidation. |
15277 |
internal_types.rs |
|
69001 |
lib.rs |
A GPU based renderer for the web.
It serves as an experimental render backend for [Servo](https://servo.org/),
but it can also be used as such in a standalone application.
# External dependencies
WebRender currently depends on [FreeType](https://www.freetype.org/)
# Api Structure
The main entry point to WebRender is the [`crate::Renderer`].
By calling [`Renderer::new(...)`](crate::Renderer::new) you get a [`Renderer`], as well as
a [`RenderApiSender`](api::RenderApiSender). Your [`Renderer`] is responsible for rendering the
previously processed frames onto the screen.
By calling [`yourRenderApiSender.create_api()`](api::RenderApiSender::create_api), you'll
get a [`RenderApi`](api::RenderApi) instance, which is responsible for managing resources
and documents. A worker thread is used internally to untie the workload from the application
thread and therefore be able to make better use of multicore systems.
## Frame
What is referred to as a `frame`, is the current geometry on the screen.
A new Frame is created by calling [`set_display_list()`](api::Transaction::set_display_list)
on the [`RenderApi`](api::RenderApi). When the geometry is processed, the application will be
informed via a [`RenderNotifier`](api::RenderNotifier), a callback which you pass to
[`Renderer::new`].
More information about [stacking contexts][stacking_contexts].
[`set_display_list()`](api::Transaction::set_display_list) also needs to be supplied with
[`BuiltDisplayList`](api::BuiltDisplayList)s. These are obtained by finalizing a
[`DisplayListBuilder`](api::DisplayListBuilder). A display list is used to draw your geometry. It
does not only contain trivial geometry; it can also store another
[`StackingContext`](api::StackingContext), as stacking contexts are nestable.
[stacking_contexts]: https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Positioning/Understanding_z_index/The_stacking_context
|
6305 |
lru_cache.rs |
This module implements a least recently used cache structure, which is
used by the texture cache to manage the lifetime of items inside the
texture cache. It has a few special pieces of functionality that the
texture cache requires, but should be usable as a general LRU cache
type if useful in other areas.
The cache is implemented with two types of backing freelists. These allow
random access to the underlying data, while being efficient in both
memory access and allocation patterns.
The "entries" freelist stores the elements being cached (for example, the
CacheEntry structure for the texture cache). These elements are stored
in arbitrary order, reusing empty slots in the freelist where possible.
The "lru_index" freelists store the LRU tracking information. Although the
tracking elements are stored in arbitrary order inside a freelist for
efficiency, they use next/prev links to represent a doubly-linked list,
kept sorted in order of recent use. The next link is also used to store
the current freelist within the array when the element is not occupied.
The LRU cache allows having multiple LRU "partitions". Every entry is tracked
by exactly one partition at any time; all partitions refer to entries in the
shared freelist. Entries can move between partitions, if replace_or_insert is
called with a new partition index for an existing handle.
The partitioning is used by the texture cache so that, for example, allocating
more glyph entries does not cause eviction of image entries (which go into
a different shared texture). If an existing handle's entry is reallocated with
a new size, it might need to move from a shared texture to a standalone
texture; in this case the handle will move to a different LRU partition.
|
23503 |
pattern.rs |
|
4072 |
picture.rs |
A picture represents a dynamically rendered image.
# Overview
A picture consists of:
- A number of primitives that are drawn onto the picture.
- A composite operation describing how to composite this
picture into its parent.
- A configuration describing how to draw the primitives on
this picture (e.g. in screen space or local space).
The tree of pictures is generated during scene building.
Depending on their composite operations pictures can be rendered into
intermediate targets or folded into their parent picture.
## Picture caching
Pictures can be cached to reduce the amount of rasterization happening per
frame.
When picture caching is enabled, the scene is cut into a small number of slices,
typically:
- content slice
- UI slice
- background UI slice which is hidden by the other two slices most of the time.
Each of these slices is made up of fixed-size large tiles of 2048x512 pixels
(or 128x128 for the UI slice).
Tiles can either hold rasterized content cached in a texture or be "clear tiles"
that contain only a solid color rectangle rendered directly during the composite
pass.
## Invalidation
Each tile keeps track of the elements that affect it, which can be:
- primitives
- clips
- image keys
- opacity bindings
- transforms
These dependency lists are built each frame and compared to the previous frame to
see if the tile changed.
The tile's primitive dependency information is organized in a quadtree, each node
storing an index buffer of tile primitive dependencies.
The union of the invalidated leaves of each quadtree produces a per-tile dirty rect
which defines the scissor rect used when replaying the tile's drawing commands and
can be used for partial present.
## Display List shape
WR will first look for an iframe item in the root stacking context to apply
picture caching to. If that's not found, it will apply to the entire root
stacking context of the display list. Apart from that, the format of the
display list is not important to picture caching. Each time a new scroll root
is encountered, a new picture cache slice will be created. If the display
list contains more than some arbitrary number of slices (currently 8), the
content will all be squashed into a single slice, in order to save GPU memory
and compositing performance.
## Compositor Surfaces
Sometimes, a primitive would prefer to exist as a native compositor surface.
This allows a large and/or regularly changing primitive (such as a video, or
webgl canvas) to be updated each frame without invalidating the content of
tiles, and can provide a significant performance win and battery saving.
Since drawing a primitive as a compositor surface alters the ordering of
primitives in a tile, we use 'overlay tiles' to ensure correctness. If a
tile has a compositor surface, _and_ that tile has primitives that overlap
the compositor surface rect, the tile switches to be drawn in alpha mode.
We rely on only promoting compositor surfaces that are opaque primitives.
With this assumption, the tile(s) that intersect the compositor surface get
a 'cutout' in the rectangle where the compositor surface exists (not the
entire tile), allowing that tile to be drawn as an alpha tile after the
compositor surface.
Tiles are only drawn in overlay mode if there is content that exists on top
of the compositor surface. Otherwise, we can draw the tiles in the normal fast
path before the compositor surface is drawn. Use of the per-tile valid and
dirty rects ensures that we do a minimal amount of per-pixel work here to
blend the overlay tile (this is not always optimal right now, but will be
improved as a follow up). |
378770 |
picture_graph.rs |
|
6837 |
picture_textures.rs |
|
13722 |
prepare.rs |
# Prepare pass
TODO: document this! |
76881 |
prim_store |
|
|
print_tree.rs |
|
3278 |
profiler.rs |
# Overlay profiler
## Profiler UI string syntax
Comma-separated list of tokens with trailing and leading spaces trimmed.
Each token can be:
- A counter name with an optional prefix. The name corresponds to the displayed name (see the
counters vector below).
- By default (no prefix) the counter is shown as average + max over half a second.
- With a '#' prefix the counter is shown as a graph.
- With a '*' prefix the counter is shown as a change indicator.
- Some special counters such as GPU time queries have specific visualizations ignoring prefixes.
- A preset name to append the preset to the UI (see PROFILER_PRESETS).
- An empty token to insert a bit of vertical space.
- A '|' token to start a new column.
- A '_' token to start a new row. |
79129 |
quad.rs |
|
51802 |
rectangle_occlusion.rs |
A simple occlusion culling algorithm for axis-aligned rectangles.
## Output
Occlusion culling results in two lists of rectangles:
- The opaque list should be rendered first. None of its rectangles overlap so order doesn't matter
within the opaque pass.
- The non-opaque list (or alpha list) which should be rendered in back-to-front order after the opaque pass.
The output has minimal overdraw (no overdraw at all for opaque items and as little as possible for alpha ones).
## Algorithm overview
The occlusion culling algorithm works in front-to-back order, accumulating rectangles in opaque and non-opaque lists.
Each time a rectangle is added, it is first tested against existing opaque rectangles and potentially split into visible
sub-rectangles, or even discarded completely. The front-to-back order ensures that once a rectangle is added it does not
have to be modified again, making the underlying data structure trivial (append-only).
## Splitting
Partially visible rectangles are split into up to 4 visible sub-rectangles by each intersecting occluder.
```ascii
+----------------------+ +----------------------+
| rectangle | | |
| | | |
| +-----------+ | +--+-----------+-------+
| |occluder | | --> | |\\\\\\\\\\\| |
| +-----------+ | +--+-----------+-------+
| | | |
+----------------------+ +----------------------+
```
In the example above the rectangle is split into 4 visible parts with the central occluded part left out.
This implementation favors longer horizontal bands instead of creating nine-patches to deal with the corners.
The advantage is that it produces fewer rectangles which is good for the performance of the algorithm and
for SWGL which likes long horizontal spans, however it would cause artifacts if the resulting rectangles
were to be drawn with a non-axis-aligned transformation.
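
A sketch of the split for a single occluder, following the horizontal-band layout in the diagram above. The `Rect` type and the `split_visible` function are hypothetical names introduced for this example (integer coordinates, `x1`/`y1` exclusive); the actual module may represent rectangles and perform the split differently.

```rust
/// Hypothetical axis-aligned rectangle; `x1` and `y1` are exclusive.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect {
    pub x0: i32,
    pub y0: i32,
    pub x1: i32,
    pub y1: i32,
}

impl Rect {
    fn is_empty(&self) -> bool {
        self.x0 >= self.x1 || self.y0 >= self.y1
    }

    fn intersects(&self, other: &Rect) -> bool {
        self.x0 < other.x1 && other.x0 < self.x1 && self.y0 < other.y1 && other.y0 < self.y1
    }
}

/// Return the parts of `rect` that remain visible behind `occluder`.
/// Produces at most four rectangles: a full-width band above, a left and a
/// right piece beside the occluder, and a full-width band below.
pub fn split_visible(rect: Rect, occluder: Rect) -> Vec<Rect> {
    if rect.is_empty() {
        return Vec::new();
    }
    if !rect.intersects(&occluder) {
        return vec![rect];
    }

    // Vertical extent of the occluded band, clamped to the rectangle.
    let top = occluder.y0.max(rect.y0);
    let bottom = occluder.y1.min(rect.y1);

    let candidates = [
        // Full-width band above the occluder.
        Rect { x0: rect.x0, y0: rect.y0, x1: rect.x1, y1: top },
        // Piece to the left of the occluder.
        Rect { x0: rect.x0, y0: top, x1: occluder.x0, y1: bottom },
        // Piece to the right of the occluder.
        Rect { x0: occluder.x1, y0: top, x1: rect.x1, y1: bottom },
        // Full-width band below the occluder.
        Rect { x0: rect.x0, y0: bottom, x1: rect.x1, y1: rect.y1 },
    ];

    // Pieces that fall outside the rectangle collapse to empty and are dropped.
    candidates.iter().copied().filter(|r| !r.is_empty()).collect()
}
```

With several occluders, each visible piece produced by one split would be split again by the next intersecting occluder, which is why the text speaks of up to 4 sub-rectangles per occluder.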
## Performance
The cost of the algorithm grows with the number of opaque rectangles as each new rectangle is tested against
all previously added opaque rectangles.
Note that opaque rectangles can either be added as opaque or non-opaque. This means a trade-off between
overdraw and number of rectangles can be explored to adjust performance: Small opaque rectangles, especially
towards the front of the scene, could be added as non-opaque to avoid causing many splits while adding only
a small amount of overdraw.
This implementation is intended to be used with a small number of (opaque) items. A similar implementation
could use a spatial acceleration structure for opaque rectangles to perform better with a large number of
occluders.
|
7526 |
render_api.rs |
|
54295 |
render_backend.rs |
The high-level module responsible for managing the pipeline and preparing
commands to be issued by the `Renderer`.
See the comment at the top of the `renderer` module for a description of
how these two pieces interact. |
80674 |
render_target.rs |
|
54273 |
render_task.rs |
|
127080 |
render_task_cache.rs |
|
14864 |
render_task_graph.rs |
This module contains the render task graph.
Code associated with creating specific render tasks is in the render_task
module. |
47679 |
renderer |
|
|
resource_cache.rs |
|
93600 |
scene.rs |
|
13309 |
scene_builder_thread.rs |
|
31249 |
scene_building.rs |
# Scene building
Scene building is the phase during which display lists, a representation built for
serialization, are turned into a scene, webrender's internal representation that is
suited for rendering frames.
This phase happens asynchronously on the scene builder thread.
# General algorithm
The important aspects of scene building are:
- Building up primitive lists (much of the cost of scene building goes here).
- Creating pictures for content that needs to be rendered into a surface, be it so that
filters can be applied or for caching purposes.
- Maintaining a temporary stack of stacking contexts to keep track of some of the
drawing states.
- Stitching multiple display lists which reference each other (without cycles) into
a single scene (see build_reference_frame).
- Interning, which detects when some of the retained state stays the same between display
lists.
The scene builder linearly traverses the serialized display list which is naturally
ordered back-to-front, accumulating primitives in the top-most stacking context's
primitive list.
At the end of each stacking context (see pop_stacking_context), its primitive list is
either handed over to a picture if one is created, or it is concatenated into the parent
stacking context's primitive list.
The flow of the algorithm is mostly linear except when handling:
- shadow stacks (see push_shadow and pop_all_shadows),
- backdrop filters (see add_backdrop_filter)
|
196293 |
screen_capture.rs |
Screen capture infrastructure for the Gecko Profiler and Composition Recorder. |
17518 |
segment.rs |
Primitive segmentation
# Overview
Segmenting is the process of breaking rectangular primitives into smaller rectangular
primitives in order to extract parts that could benefit from fast paths.
Typically this is used to allow fully opaque segments to be rendered in the opaque
pass. For example when an opaque rectangle has a non-axis-aligned transform applied,
we usually have to apply some anti-aliasing around the edges which requires alpha
blending. By segmenting the edges out of the center of the primitive, we can keep a
large amount of pixels in the opaque pass.
Segmenting also lets us avoid rasterizing parts of clip masks that we know to have
no effect or to be fully masking. For example by segmenting the corners of a rounded
rectangle clip, we can optimize both rendering the mask and the primitive by only
rasterizing the corners in the mask and not applying any clipping to the segments of
the primitive that don't overlap the borders.
It is a flexible system in the sense that different sources of segmentation (for
example two rounded rectangle clips) can affect the segmentation, and the possibility
to segment some effects such as specific clip kinds does not necessarily mean the
primitive will actually be segmented.
## Segments and clipping
Segments of a primitive can be either not clipped, fully clipped, or partially clipped.
In the first two cases we don't need a clip mask. For each partially masked segment, a
mask is rasterized using a render task. All of the interesting steps happen during frame
building.
- The first step is to determine the segmentation and write the associated GPU data.
See `PrimitiveInstance::build_segments_if_needed` and `write_brush_segment_description`
in `prim_store/mod.rs` which uses the segment builder of this module.
- The second step is to generate the mask render tasks.
See `BrushSegment::update_clip_task` and `RenderTask::new_mask`. For each segment that
needs a mask, the contribution of all clips that affect the segment is added to the
mask's render task.
- Segments are assigned to batches (See `batch.rs`). Segments of a given primitive can
be assigned to different batches.
See also the [`clip` module documentation][clip.rs] for details about how clipping
information is represented.
[clip.rs]: ../clip/index.html
|
46363 |
space.rs |
Utilities to deal with coordinate spaces. |
9499 |
spatial_node.rs |
|
44881 |
spatial_tree.rs |
|
79976 |
surface.rs |
Contains functionality to help building the render task graph from a series of off-screen
surfaces that are created during the prepare pass. For now, it maintains existing behavior.
A future patch will add support for surface sub-graphs, while ensuring the render task
graph itself is built correctly with dependencies regardless of the surface kind (chained,
tiled, simple).
|
27217 |
telemetry.rs |
|
2430 |
texture_cache.rs |
|
68680 |
texture_pack |
|
|
tile_cache.rs |
Types and functionality related to picture caching. In future, we'll
move more and more of the existing functionality out of picture.rs
and into here.
|
28180 |
util.rs |
|
56307 |
visibility.rs |
# Visibility pass
TODO: document what this pass does!
|
15047 |