Amethyst Legion ECS Evolution RFC

Legion ECS Evolution

Following lengthy discussion on both Discord and the Amethyst Forum (most of which, including chat logs, can be found here), we propose with this RFC to move Amethyst from SPECS to Legion, an ECS framework building on concepts in SPECS Parallel ECS as well as lessons learned since. This proposal stems from the improved foundational flexibility of Legion's approach, which would be untenable to effect on the current SPECS crate without forcing all users of SPECS to adapt to what is essentially a rewrite centered on the needs of Amethyst. The flexibility in Legion comes with tradeoffs: it generally brings benefits in performance and runtime flexibility while giving up some of the ergonomics of the SPECS interface. The benefits, and the impetus for seeking them, are described in the "Motivations" section; the implications of the tradeoffs that follow from those benefits are outlined in greater detail in the "Tradeoffs" section.

There are some core parts of Amethyst which may either need to considerably change when moving to Legion, or would otherwise just benefit from substantial changes to embrace the flexibility of Legion. Notably, systems in Legion are FnMut closures, and all systems require usage of SystemDesc to construct the closure and its associated Query structure. The dispatch technique in Legion is necessarily very different from SPECS, and the parts of the engine dealing with dispatch may also be modified in terms of Legion's dispatcher. Furthermore, the platform of Legion provides ample opportunity to improve our Transform system, with improved change detection tools at our disposal. These changes as we understand them are described below in the "Refactoring" section.

The evaluation of this large transition requires undertaking a progressive port of Amethyst to Legion with a temporary synchronization shim between SPECS and Legion. This effort exists here, utilizing the Legion fork here. Currently, this progressive fork has fully transitioned the Amethyst Renderer, one of the largest and most involved parts of the engine ECS-wise, and is capable of running that demo we're all familiar with.

Not only can you take a peek at what actual code transitioned directly to Legion looks like in this fork, but the refactoring work in that fork can also be reused if this RFC is accepted, while in the meantime helping to reveal where there may be shortcomings or surprises.

Motivations

The forum thread outlines the deficiencies we are facing with specs in detail. This table below is a high level summary of the problems we are having with specs, and how legion solves each one.

| Specs | Legion |
| --- | --- |
| Typed storage nature prevents proper FFI | All underlying legion storage is based on TypeId lookups for resources and components |
| hibitset crate has allocation/reallocation overhead, branch misses | Archetypes eliminate the need of entity ID collections being used for iteration |
| Sparse storages cause cache incoherence | Legion guarantees allocation of similar entities into contiguous, aligned chunks with all their components in linear memory |
| Storage fetching inherently causes many branch mispredictions | See previous |
| Storage methodology inherently makes FlaggedStorage not thread safe | Queries in legion store filter and change state, allowing for extremely granular change detection on an Archetype, Chunk and Entity level |
| Component mutation flagging limited to any mutable access | Legion dispatches on an Archetype basis instead of component, allowing parallel execution across the same component data, but just different entities. *A special case exists for sparse read/write of components where this isn't the case |
| Parallelization limited to component-level, no granular accesses | See previous information about Archetypes |
| Many elided and explicit lifetimes throughout make code less ergonomic | System API designed to hide the majority of these lifetimes in safety and ergonomic wrappers |
| ParJoin has mutation limitations | See previous statements about system dispatcher and Archetypes |
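To make the archetype rows of the table more concrete, here is a minimal, std-only sketch of the idea. It is not Legion's storage code: `ToyWorld`, `Archetype`, `Position` and `Velocity` are hypothetical names, and a real implementation keys archetypes by the full set of component TypeIds rather than by a single bool. It only illustrates how grouping entities with the same component set lets a query walk densely packed slices instead of looking components up per entity.

```rust
use std::collections::HashMap;

// Hypothetical component types, used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Position(pub f32);
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Velocity(pub f32);

/// One archetype: every entity in it has the same component set, and each
/// component type lives in its own densely packed column.
#[derive(Default)]
pub struct Archetype {
    pub entities: Vec<u32>,
    pub positions: Vec<Position>,
    pub velocities: Vec<Velocity>, // stays empty if the archetype has no Velocity
}

#[derive(Default)]
pub struct ToyWorld {
    // Assumption for brevity: the archetype key is "has a Velocity or not".
    archetypes: HashMap<bool, Archetype>,
    next_id: u32,
}

impl ToyWorld {
    /// Insert an entity into the archetype matching its component set.
    pub fn insert(&mut self, pos: Position, vel: Option<Velocity>) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        let arch = self.archetypes.entry(vel.is_some()).or_default();
        arch.entities.push(id);
        arch.positions.push(pos);
        if let Some(v) = vel {
            arch.velocities.push(v);
        }
        id
    }

    /// A "query" for (Position, Velocity): only the matching archetype is
    /// visited, as a linear walk over two slices.
    pub fn integrate(&mut self, dt: f32) {
        if let Some(arch) = self.archetypes.get_mut(&true) {
            for (p, v) in arch.positions.iter_mut().zip(&arch.velocities) {
                p.0 += v.0 * dt;
            }
        }
    }

    /// Read back the position column of one archetype.
    pub fn positions(&self, with_velocity: bool) -> &[Position] {
        self.archetypes
            .get(&with_velocity)
            .map(|a| a.positions.as_slice())
            .unwrap_or(&[])
    }
}
```

Entities without a Velocity are never touched by `integrate`, which is the property the table attributes to archetype-based iteration.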

Immediate Benefits in a Nutshell

Significant performance gains
Scripting RFC can move forward
Queries open up many new optimizations for change detection such as culling, the transform system, etc.
More granular parallelization than we already have achieved
Resolves the dispatcher Order of Insertion design flaws
???

Tradeoffs

These are some things I have run into while porting that were cumbersome changes or thoughts. This is by no means comprehensive. Some of these items may not make sense until you understand legion and/or read the rest of this RFC.

Systems are moved to a closure, but ergonomics are given for still maintaining state, mainly in the use of FnMut [ref] for the closure, and an alternative build_disposable [ref].
All systems are built with closures, causing some initialization design changes in regards to reference borrowing
The SystemDesc/System types have been removed.
A trait type cannot be used for System declaration, due to the typed nature of Queries in legion. It is far more feasible and ergonomic to use closures for type deduction. The exception to this is thread-local execution, which can still be typed for ease of use.
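The point about closures still being able to hold state can be shown with plain Rust, independent of any ECS crate. In the sketch below, `Frame`, `ToySystem` and `build_counter_system` are hypothetical names invented for illustration. State that would have been a field on a struct implementing a System trait is instead created in the builder function and moved into an FnMut closure.

```rust
/// Hypothetical per-frame data that a system operates on.
pub struct Frame {
    pub ticks: u64,
    pub log: Vec<String>,
}

/// In this sketch a "system" is just a boxed FnMut over the frame data.
pub type ToySystem = Box<dyn FnMut(&mut Frame)>;

/// Builder function: state is set up here and captured by the closure.
pub fn build_counter_system(label: &str) -> ToySystem {
    let label = label.to_owned(); // captured, read-only state
    let mut runs: u32 = 0; // captured, mutable state (needs FnMut, not Fn)
    Box::new(move |frame: &mut Frame| {
        runs += 1;
        frame.ticks += 1;
        frame.log.push(format!("{} run #{}", label, runs));
    })
}
```

Because the closure is FnMut, `runs` persists between invocations, so the second call logs "run #2" without any struct being declared for the system.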

Refactoring

This port of Amethyst from specs -> legion has aimed to keep some of the consistencies of specs and what Amethyst users would already be familiar with. Much of the implementation of Legion and the amethyst-specific components was heavily inspired by or copied from the current standing implementations.

SystemBundle, System and Dispatcher refactor

This portion of the port will have the most significant impact on users, as this is where their day-to-day coding exists. The following is an example of the same system, in both specs and legion.

High Level Changes

Systems are all now FnMut closures. This allows for easier declaration and type deduction. They can capture variables from their builder for state. Additional ‘disposable’ build types are available for more complex stateful modes.
System data declarations are now all within a builder, and not on a trait.
Component data is now accessed via "queries" instead of "component storages".
Component addition/removal is now deferred, in line with entity creation/removal.
Default resource allocation is removed; all world resource access now returns Option<Ref>.
Explicit system registration dependencies are removed; execution is now ordered based on "Stages", which can be explicit priorities, but all system execution is flattened into a single data-dependent execution.
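Two of the changes listed above, Option-returning resource access and deferred structural changes, can be sketched without any ECS dependency. Everything named here (`Resources`, `Command`, `DeferredWorld`, `flush`) is a hypothetical stand-in rather than Legion's API. The sketch only assumes what the list states: a missing resource is not default-constructed, and additions/removals are queued and applied later.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Type-indexed resource storage with no implicit defaults.
#[derive(Default)]
pub struct Resources {
    map: HashMap<TypeId, Box<dyn Any>>,
}

impl Resources {
    pub fn insert<T: Any>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// A resource that was never inserted yields `None` instead of a default.
    pub fn get<T: Any>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

/// Structural changes are recorded as commands while systems run.
pub enum Command {
    Spawn(&'static str),
    Despawn(usize),
}

#[derive(Default)]
pub struct DeferredWorld {
    pub entities: Vec<&'static str>,
    pub resources: Resources,
}

/// Apply the queued commands in order, after the systems have finished.
pub fn flush(world: &mut DeferredWorld, commands: Vec<Command>) {
    for command in commands {
        match command {
            Command::Spawn(name) => world.entities.push(name),
            Command::Despawn(index) => {
                if index < world.entities.len() {
                    world.entities.remove(index);
                }
            }
        }
    }
}
```

Callers therefore have to handle the `None` case explicitly, and they observe spawned or despawned entities only after `flush` has run.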

Following is an example of a basic system in both specs and legion

Specs

impl<'a> System<'a> for OrbitSystem {
    type SystemData = (
        Read<'a, Time>,
        ReadStorage<'a, Orbit>,
        WriteStorage<'a, Transform>,
        Write<'a, DebugLines>,
    );

    fn run(&mut self, (time, orbits, mut transforms, mut debug): Self::SystemData) {
        for (orbit, transform) in (&orbits, &mut transforms).join() {
            let angle = time.absolute_time_seconds() as f32 * orbit.time_scale;
            let cross = orbit.axis.cross(&Vector3::z()).normalize() * orbit.radius;
            let rot = UnitQuaternion::from_axis_angle(&orbit.axis, angle);
            let final_pos = (rot * cross) + orbit.center;

            debug.draw_line(
                orbit.center.into(),
                final_pos.into(),
                Srgba::new(0.0, 0.5, 1.0, 1.0),
            );
            transform.set_translation(final_pos);
        }
    }
}

Legion

fn build_orbit_system(
    world: &mut amethyst::core::legion::world::World,
) -> Box<dyn amethyst::core::legion::schedule::Schedulable> {
    SystemBuilder::<()>::new("OrbitSystem")
        .with_query(<(Write<Transform>, Read<Orbit>)>::query())
        .read_resource::<Time>()
        .write_resource::<DebugLines>()
        .build(move |commands, world, (time, debug), query| {
            query
                .iter_entities()
                .for_each(|(entity, (mut transform, orbit))| {
                    let angle = time.absolute_time_seconds() as f32 * orbit.time_scale;
                    let cross = orbit.axis.cross(&Vector3::z()).normalize() * orbit.radius;
                    let rot = UnitQuaternion::from_axis_angle(&orbit.axis, angle);

                    let final_pos = (rot * cross) + orbit.center;

                    debug.draw_line(
                        orbit.center.into(),
                        final_pos.into(),
                        Srgba::new(0.0, 0.5, 1.0, 1.0),
                    );
                    transform.set_translation(final_pos);
                });
        })
}

Example bundle Changes

RenderBundle - Specs

impl<'a, 'b, B: Backend> SystemBundle<'a, 'b> for RenderingBundle<B> {
    fn build(
        mut self,
        world: &mut World,
        builder: &mut DispatcherBuilder<'a, 'b>,
    ) -> Result<(), Error> {
        builder.add(MeshProcessorSystem::<B>::default(), "mesh_processor", &[]);
        builder.add(
            TextureProcessorSystem::<B>::default(),
            "texture_processor",
            &[],
        );

        builder.add(Processor::<Material>::new(), "material_processor", &[]);
        builder.add(
            Processor::<SpriteSheet>::new(),
            "sprite_sheet_processor",
            &[],
        );

        // make sure that all renderer-specific systems run after game code
        builder.add_barrier();
        for plugin in &mut self.plugins {
            plugin.on_build(world, builder)?;
        }
        builder.add_thread_local(RenderingSystem::<B, _>::new(self.into_graph_creator()));
        Ok(())
    }
}

RenderBundle - Legion

impl<B: Backend> SystemBundle for RenderingBundle<B> {
    fn build(mut self, world: &mut World, builder: &mut DispatcherBuilder) -> Result<(), Error> {
        builder.add_system(Stage::Begin, build_mesh_processor::<B>);
        builder.add_system(Stage::Begin, build_texture_processor::<B>);
        builder.add_system(Stage::Begin, build_asset_processor::<Material>);
        builder.add_system(Stage::Begin, build_asset_processor::<SpriteSheet>);

        for plugin in &mut self.plugins {
            plugin.on_build(world, builder)?;
        }

        let config: rendy::factory::Config = Default::default();
        let (factory, families): (Factory<B>, _) = rendy::factory::init(config).unwrap();
        let queue_id = QueueId {
            family: families.family_by_index(0).id(),
            index: 0,
        };

        world.resources.insert(factory);
        world.resources.insert(queue_id);

        let mat = crate::legion::system::create_default_mat::<B>(&world.resources);
        world.resources.insert(crate::mtl::MaterialDefaults(mat));

        builder.add_thread_local(move |world| {
            build_rendering_system(world, self.into_graph_creator(), families)
        });

        Ok(())
    }
}

Parallelization of mutable queries

One of the major benefits of legion is its granularity with queries. Specs is currently not capable of performing a parallel join of Transform, because FlaggedStorage is not thread safe. Additionally, a mutable join such as the one above automatically flags all Transform components as mutated, meaning any readers will get N(entities) events.

In legion, however, we get this short syntax:

query.par_for_each(|(entity, (mut transform, orbit))| {

Under the hood, this code actually accomplishes more than ParJoin may in specs. This method threads on a per-chunk basis in legion, meaning similar data is iterated linearly and all components of those entities are in cache.
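As a rough illustration of what "threading on a per-chunk basis" means, the following std-only sketch hands each chunk to its own scoped thread. This is an assumption-laden toy: `Chunk`, `OrbitRadius`, `TranslationX` and `par_for_each_chunk` are hypothetical names, and a real scheduler would use a thread pool rather than one thread per chunk. The part it tries to show is that each worker gets exclusive access to one chunk and iterates its columns linearly.

```rust
use std::thread;

// Hypothetical, simplified components for the sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct OrbitRadius(pub f32);
#[derive(Clone, Copy, Debug, PartialEq, Default)]
pub struct TranslationX(pub f32);

/// A chunk stores the components of a group of entities side by side.
pub struct Chunk {
    pub radii: Vec<OrbitRadius>,
    pub translations: Vec<TranslationX>,
}

/// Mutate every translation in parallel, one worker per chunk.
/// No two workers ever touch the same chunk, so no locking is needed.
pub fn par_for_each_chunk(chunks: &mut [Chunk], cos_angle: f32) {
    thread::scope(|scope| {
        for chunk in chunks.iter_mut() {
            scope.spawn(move || {
                // Linear walk over two columns of the same chunk.
                for (t, r) in chunk.translations.iter_mut().zip(&chunk.radii) {
                    t.0 = r.0 * cos_angle;
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}
```

The mutable borrow of each chunk is what makes this safe to parallelize, in contrast to a single shared storage that would need interior synchronization.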

Transform Refactor (legion_transform)

Legion transform implementation

@AThilenius has taken on the task of refactoring the core Transform system. This system had some faults of its own, which were also exacerbated by specs. The system itself is heavily tied in with how specs operates, so a rewrite of the transform system was already in the cards for this migration.

Hierarchy

This refactor is aimed towards following the Unity design, where the source-of-truth for the hierarchy (hot data) is stored in Parent components (i.e. a child has a parent). This has the added benefit of ensuring only tree structures can be formed at the API level. Along with the Parent component, the transform system will create/update a Children component on each parent entity. This is necessary for efficient root->leaf iteration of trees, which is a needed operation for many systems. It should be noted, however, that the Children component is only guaranteed valid after the transform systems have run and before any hierarchy edits have been made.

Several other methods of storing the hierarchy were considered and prototyped, including an implicit linked-list, detailed here. Given all the tradeoffs and technical complexity of the various methods (and because a very large game engine company has come to the same conclusion), the current method was chosen. More info can be found in the readme of legion_transform.
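The relationship between Parent (source of truth) and Children (derived cache) described above can be sketched with ordinary maps. This is not the legion_transform implementation: `Entity`, `rebuild_children` and `walk_from_root` are hypothetical names, and the sketch assumes the parent links contain no cycles.

```rust
use std::collections::BTreeMap;

pub type Entity = u32;

/// `parents` maps child -> parent and is the only data users edit.
/// The result maps parent -> children and is rebuilt from scratch, which
/// mirrors why the derived data is only valid until the next hierarchy edit.
pub fn rebuild_children(parents: &BTreeMap<Entity, Entity>) -> BTreeMap<Entity, Vec<Entity>> {
    let mut children: BTreeMap<Entity, Vec<Entity>> = BTreeMap::new();
    for (&child, &parent) in parents {
        children.entry(parent).or_default().push(child);
    }
    children
}

/// Root-to-leaf traversal (depth-first, pre-order) using the derived map.
/// Assumes the hierarchy is acyclic.
pub fn walk_from_root(root: Entity, children: &BTreeMap<Entity, Vec<Entity>>) -> Vec<Entity> {
    let mut order = Vec::new();
    let mut stack = vec![root];
    while let Some(entity) = stack.pop() {
        order.push(entity);
        if let Some(kids) = children.get(&entity) {
            // Push in reverse so the first child is visited first.
            stack.extend(kids.iter().rev().copied());
        }
    }
    order
}
```

Since each child stores exactly one parent, a child can never appear under two parents, which is the structural guarantee the Parent-based design relies on.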

Transform

The original Amethyst transform was problematic for several reasons, largely because it was organically grown:

The local_to_world matrix was stored in the same component as the affine-transform values.
- This also implies that use of the original transform component was tightly coupled with the hierarchy it belongs to (namely its parent chain).
The component was a full Affine transform (for some reason split between an Isometry and a non-uniform scale stored as a Vector3).
Much of the nalgebra API for 3D space transform creation/manipulation was replicated with little benefit.

Given the drawbacks of the original transform, it was decided to start from a clean slate, again taking inspiration from the new Unity ECS. User defined space transforms come in the form of the following components:

Translation (Vector3 XYZ translation)
Rotation (UnitQuaternion rotation)
Scale (single f32, used for uniform scaling, ie. where scale x == y == z)
NonUniformScale (a Vector3 for non-uniform scaling, which should be avoided when possible)

Any valid combination of these components can also be added (although Scale and NonUniformScale are mutually exclusive). For example, if your entity only needs to translate, you need only pay the cost of storing and computing the translation, and can skip the cost of storing and computing (into the final homogeneous matrix4x4) the Rotation and Scale.

The final homogeneous matrix is stored in the LocalToWorld component, which describes the space transform from entity->world space regardless of hierarchy membership. In the event that an entity is a member of a Hierarchy, an additional LocalToParent component (also a homogeneous matrix4x4) will be computed first and used for the final LocalToWorld update. This has the benefits of:

The LocalToWorld matrix will always exist and be updated for any entity with a space transform (ex any entity that should be rendered) regardless of hierarchy membership.
Any entity that is static (or is part of a static hierarchy) can have its LocalToWorld matrix pre-baked, and the other transform components need not be stored.
No other system that doesn’t explicitly care about the hierarchy needs to know anything about it (e.g. rendering needs only the LocalToWorld component).
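The matrix flow described in this section can be worked through with a small, dependency-free sketch. It is only an illustration under stated assumptions: matrices are row-major 4x4 arrays applied to column vectors, rotation is left out for brevity, and `Mat4`, `local_matrix`, `local_to_world` and `transform_point` are hypothetical helpers, not the legion_transform API.

```rust
pub type Mat4 = [[f32; 4]; 4];

pub const IDENTITY: Mat4 = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
];

/// Plain 4x4 matrix product `a * b`.
pub fn mul(a: &Mat4, b: &Mat4) -> Mat4 {
    let mut out = [[0.0; 4]; 4];
    for r in 0..4 {
        for c in 0..4 {
            for k in 0..4 {
                out[r][c] += a[r][k] * b[k][c];
            }
        }
    }
    out
}

/// Build the local matrix from whichever components the entity has.
/// Missing components cost nothing: the identity entries are left as they are.
pub fn local_matrix(translation: Option<[f32; 3]>, uniform_scale: Option<f32>) -> Mat4 {
    let mut m = IDENTITY;
    if let Some(s) = uniform_scale {
        for i in 0..3 {
            m[i][i] = s;
        }
    }
    if let Some(t) = translation {
        for i in 0..3 {
            m[i][3] = t[i];
        }
    }
    m
}

/// Without a parent the local matrix already is LocalToWorld; with a parent
/// it plays the role of LocalToParent and is composed with the parent's matrix.
pub fn local_to_world(parent_local_to_world: Option<&Mat4>, local: &Mat4) -> Mat4 {
    match parent_local_to_world {
        Some(parent) => mul(parent, local),
        None => *local,
    }
}

/// Apply a matrix to a point (w = 1).
pub fn transform_point(m: &Mat4, p: [f32; 3]) -> [f32; 3] {
    let mut out = [0.0; 3];
    for r in 0..3 {
        out[r] = m[r][0] * p[0] + m[r][1] * p[1] + m[r][2] * p[2] + m[r][3];
    }
    out
}
```

For example, a child with translation (1, 2, 3) and uniform scale 2 under a parent translated by (10, 0, 0) maps the local point (1, 1, 1) to (13, 4, 5) in world space. A consumer such as a renderer only needs the resulting matrix and never has to inspect the hierarchy.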

Dispatcher refactor

The Dispatcher has been rewritten to utilize the built-in legion StageExecutor, while layering amethyst needs on top of it.

The builder and registration process still looks fairly similar; the main differences are that naming is now debug only, and explicit dependencies have been removed in favor of inferred insertion order via Stages, followed by full parallel execution. ThreadLocal execution also still exists, to execute the end of any given frame in the local game thread. This means that a Stage or RelativeStage can be used to infer the insertion order of the system, but it will still execute based on its data dependencies, and not strictly its place "in line".
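One way to picture "insertion order from Stages, execution from data dependencies" is the toy planner below. It is explicitly not the algorithm used by Legion's StageExecutor, which this RFC does not spell out. `Stage`, `SystemInfo`, `conflicts` and `plan` are hypothetical, and the greedy batching rule is an assumption chosen for illustration.

```rust
/// Hypothetical stages; the derived ordering follows declaration order.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Stage {
    Begin,
    Logic,
    Render,
}

pub struct SystemInfo {
    pub name: &'static str,
    pub stage: Stage,
    pub reads: Vec<&'static str>,
    pub writes: Vec<&'static str>,
}

/// Two systems conflict if either one writes data the other reads or writes.
fn conflicts(a: &SystemInfo, b: &SystemInfo) -> bool {
    a.writes.iter().any(|w| b.writes.contains(w) || b.reads.contains(w))
        || b.writes.iter().any(|w| a.reads.contains(w))
}

/// 1. Order systems by stage (stable sort keeps registration order per stage).
/// 2. Put each system into the first batch after the last batch that holds a
///    conflicting system. Systems in the same batch could run in parallel.
pub fn plan(mut systems: Vec<SystemInfo>) -> Vec<Vec<&'static str>> {
    systems.sort_by_key(|s| s.stage);
    let mut batches: Vec<Vec<SystemInfo>> = Vec::new();
    for sys in systems {
        let slot = batches
            .iter()
            .rposition(|batch| batch.iter().any(|other| conflicts(other, &sys)))
            .map_or(0, |i| i + 1);
        if slot == batches.len() {
            batches.push(Vec::new());
        }
        batches[slot].push(sys);
    }
    batches
        .into_iter()
        .map(|batch| batch.into_iter().map(|s| s.name).collect())
        .collect()
}
```

In this model a Render-stage system that only reads Time can share a batch with Begin-stage and Logic-stage systems, while a system reading Transform has to wait for the system writing it. The stage decides who is considered first, not who runs alone.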

Migration Story

World Synchronization Middleware

Because of the fundamental changes inherent in this migration, significant effort has gone into making the transition, and the use of both "old" and "new" systems side by side, as seamless as possible. This does come with a significant performance cost, but should allow people to use a mix of specs and legion while testing migration.

The engine provides a "LegionSyncer" trait; this is dispatched to configure and handle syncing of resources and components between specs and legion
Underneath these LegionSyncer traits lies a set of automatic syncer implementations for the common use cases. This includes resource and component synchronization between the two worlds.
Dispatching already exists for both worlds; dispatch occurs as:
1. Run specs
2. Sync world Specs -> Legion
3. Run Legion
4. Sync world Legion -> Specs
Helper functions have been added to the GameDataBuilder and DispatcherBuilder to streamline this process.

In the current design, syncers are not enabled by default and must be explicitly selected by the user via the game data builder. For example:

.migration_resource_sync::<Scene>()
.migration_resource_sync::<RenderMode>()
.migration_component_sync::<Orbit>()
.migration_sync_bundle(amethyst::core::legion::Syncer::default())
.migration_sync_bundle(amethyst::renderer::legion::Syncer::<DefaultBackend>::default())

The above will explicitly synchronize the specified resources and components. Additionally, the "Sync bundles" are provided for synchronizing the features out of any given crate.

This synchronization use can be seen in the current examples, as they still utilize a large amount of unported specs systems.
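The four-step dispatch order and the opt-in nature of the syncers can be modelled in a few lines. The sketch below is a toy model only: worlds are reduced to string-keyed maps, and `ToyStore`, `sync` and `dispatch_frame` are hypothetical names, not the LegionSyncer trait or the builder helpers.

```rust
use std::collections::HashMap;

/// Toy stand-in for a world: named values instead of typed components.
pub type ToyStore = HashMap<&'static str, i32>;

/// Copy only the explicitly selected keys from one world to the other.
pub fn sync(from: &ToyStore, to: &mut ToyStore, synced_keys: &[&'static str]) {
    for key in synced_keys {
        if let Some(value) = from.get(key) {
            to.insert(*key, *value);
        }
    }
}

/// One frame of the migration dispatch, in the order listed above.
pub fn dispatch_frame(
    specs: &mut ToyStore,
    legion: &mut ToyStore,
    synced_keys: &[&'static str],
    mut specs_systems: impl FnMut(&mut ToyStore),
    mut legion_systems: impl FnMut(&mut ToyStore),
) {
    specs_systems(&mut *specs); // 1. Run specs
    sync(specs, legion, synced_keys); // 2. Sync world Specs -> Legion
    legion_systems(&mut *legion); // 3. Run Legion
    sync(legion, specs, synced_keys); // 4. Sync world Legion -> Specs
}
```

Data that was not registered for syncing stays in the world that owns it, and every synced value is copied twice per frame, which gives a feel for where the performance cost mentioned earlier comes from.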

Proposed Timeline

With the synchronization middleware available, it gives users the ability to slowly transition to the new systems while actively testing their project. I propose the following release timeline for this, allowing users to skip versions as we go and work between:

The current implementation is feature gated behind "legion-ecs" feature. This can be released as a new version of Amethyst to begin migration
The next release performs a "hard swap", from “specs default” to “Legion default”. This would rename all the migration_with_XXX functions to specs, and make legion default. This would also include ironing out the legion world defaulting in the dispatchers and builders.
The next release removes specs entirely, leaving legion in its place.

Brain-fart of changes needed by users

Render Plugins/Passes
- Change renderer::bundle::* imports to renderer::legion::bundle
- Change renderer::submodules::* imports to renderer::legion::submodules
- All resource access changes from world to world.resources
  - fetch/fetch_mut change to get/get_mut
    - get/get_mut return "Option", not default
  - Read and Write remove lifetime
  - ReadExpect/WriteExpect change to Read/Write
  - You can still use the same <(Data)>::fetch(&world.resources) syntax
- Resources need to cache their own queries
  - DOING THIS IN A CLOSURE IS RECOMMENDED, TO PREVENT A TYPING NIGHTMARE
- amethyst/amethyst-imgui@06c1a58: a commit showing a port
