Overview:
Live migration moves running virtual machines from one physical server to
another with no impact on end users. It keeps your IT environment up and
running, giving you unprecedented flexibility and availability to meet the
increasing demands of your business and end users.
- Reduce IT costs and improve flexibility with server consolidation
- Decrease downtime and improve reliability with business continuity and disaster recovery
- Increase energy efficiency by running fewer servers and dynamically powering down unused servers with our green IT solutions
However, limitations of the current migration technology start to appear when
it is applied to larger application systems such as SAP ERP or SAP ByDesign.
Such systems consume a large amount of memory and cannot be transferred as
seamlessly as smaller ones, creating service interruptions. Limiting the impact
and optimising the migration becomes even more important with the
generalisation of Service Level Agreements (SLAs). This strand of research
within the Hecatonchire project aims at improving the live migration of VMs
running large enterprise applications without severely disrupting their live
services, even across the Internet.
How it works:
Full post-copy live migration
- Stop the VM on Host A at the beginning
- Send all the CPU and device state to the destination (the memory contents are deferred)
- Send the RAM layout information and unmap the whole RAM region on Host B for the RDMA connection
- Immediately start KVM on Host B
- Host B will start page faulting and pull pages from Host A on demand (plus background prefetching)
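A minimal sketch of this sequence in C-style pseudocode; every helper name
(stop_vm, send_cpu_and_device_state, and so on) is a hypothetical stand-in for
illustration, not an actual Hecatonchire or QEMU/KVM function:

    /* Full post-copy: ship CPU/device state first, defer the memory,
     * and resolve missing pages over RDMA after the VM resumes on
     * Host B. All helpers below are hypothetical names. */
    void full_postcopy_migrate(struct vm *vm, struct host *a, struct host *b)
    {
        stop_vm(vm, a);                        /* stop the VM on Host A   */
        send_cpu_and_device_state(vm, b);      /* no memory contents yet  */
        send_ram_layout(vm, b);                /* RAM size/regions only   */
        unmap_guest_ram(vm, b);                /* leave RAM unmapped on B */
        rdma_connect(vm, a, b);                /* faults resolved via RDMA */
        start_kvm(vm, b);                      /* resume immediately on B */
        start_background_prefetcher(vm, a, b); /* pull remaining pages    */
    }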
[Figure: Pre-copy vs. post-copy live migration]
Hybrid post-copy live migration
Hybrid post-copy live migration provides a middle ground between the full
post-copy and the pre-copy approaches. It limits the impact of page faulting by
first running a pre-copy phase, while still providing a deterministic
switch-over with reduced performance impact during the overall execution of the
live migration.
[Figure: Hybrid live migration]
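The switch-over logic can be sketched as follows, again in C-style pseudocode
with hypothetical helpers; the convergence test inside the pre-copy loop is an
assumption about how the dirty-page iteration is bounded:

    #include <time.h>

    /* Hybrid migration: pre-copy for at most 'timeout' seconds, then
     * stop the VM, ship the CPU state, and let post-copy fetch the
     * rest. Helper names are hypothetical, as in the previous sketch. */
    void hybrid_migrate(struct vm *vm, struct host *a, struct host *b,
                        time_t timeout)
    {
        time_t deadline = time(NULL) + timeout;

        /* Pre-copy phase: iterate over dirty pages until the deadline. */
        while (time(NULL) < deadline)
            if (send_dirty_pages(vm, a, b) == 0)
                break;          /* converged within the allocated time */

        /* Deterministic switch-over: no final stop-and-copy round. */
        stop_vm(vm, a);
        send_cpu_and_device_state(vm, b);
        start_kvm(vm, b);       /* missing pages arrive post-copy */
        start_background_prefetcher(vm, a, b);
    }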
Architecture:
[Figure: Post-copy live migration]
If the VM touches a not-yet-transferred memory page, the VM page faults and
initiates a memory request over RDMA using an in-kernel RDMA engine. This
engine copies the content of the memory page from the source and resolves the
page fault.
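In rough C-style pseudocode, the destination-side fault path looks like this;
the names are illustrative rather than the real in-kernel API, and the use of a
one-sided RDMA READ is an assumption:

    /* Destination-side fault path: an RDMA READ pulls the page from
     * the source's registered RAM region, then the page is mapped and
     * the fault completed. Illustrative names only. */
    int postcopy_fault_handler(struct vm *vm, unsigned long gfn)
    {
        void *page = alloc_host_page();

        /* One-sided RDMA READ: no CPU involvement on the source host. */
        rdma_read(vm->rdma_conn, page, remote_addr_of(vm, gfn), PAGE_SIZE);

        map_guest_page(vm, gfn, page);  /* install the page in the guest  */
        return resolve_fault(vm, gfn);  /* let the faulting vCPU continue */
    }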
Prototype / Demo:
We present the design, implementation, and evaluation of post-copy based live
migration for virtual machines (VMs) across a Gigabit LAN. Post-copy migration
defers the transfer of a VM's memory contents until after its processor state
has been sent to the target host. This deferral is in contrast to the
traditional pre-copy approach, which first copies the memory state over multiple
iterations followed by a final transfer of the processor state. The post-copy
strategy can provide a "win-win" by reducing total migration time while
maintaining the liveness of the VM during migration.
The following video demonstrates three different post-copy live migration
scenarios:
- Full post-copy
- Hybrid, 10-second timeout: standard live migration switches to full post-copy after 10 seconds
- Hybrid, 60-second timeout: standard live migration finishes within the allocated time (however, we do not follow the standard process, as there is no stop-and-copy phase: just stop, send over the CPU state, and restart; missing pages are fetched on demand or by the background thread)
Compared with the traditional approach, we demonstrated that post-copy improves
several metrics, including pages transferred, total migration time, and network
overhead. It also provides deterministic live migration, a feature missing from
the traditional approach, in which the system administrator has no control over
workload placement and transfer.
Comparison between Yabusame and the RDMA kernel approach:
Yabusame relies on a special character device driver that allows transparent
memory page retrieval from the source host for the running VM at the
destination. However, as shown in the diagram above, this requires a lot of
communication between the different components, as well as context switching,
which tends to be less than optimal. With the approach we are proposing, we are
able to eliminate most of the overhead associated with memory transfer while
improving overall performance.
Also, when the VM touches a not-yet-transferred memory page, Yabusame pauses
the VM temporarily, while our approach makes full use of the asynchronous page
fault mechanism, allowing us to avoid pausing the system as much as possible.
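The difference between the two fault paths can be sketched as follows; both
functions use hypothetical names, and the asynchronous variant only outlines
the general shape of KVM's asynchronous page fault mechanism:

    /* Synchronous path (Yabusame-style): the vCPU makes no progress
     * until the page arrives. Hypothetical names throughout. */
    void fault_sync(struct vcpu *vcpu, unsigned long gfn)
    {
        pause_vcpu(vcpu);
        fetch_page_from_source(vcpu->vm, gfn);
        resume_vcpu(vcpu);
    }

    /* Asynchronous path: tell the guest the page is not present so it
     * can schedule other work, start the RDMA read, and inject a
     * "page ready" event from the callback once the read completes. */
    void fault_async(struct vcpu *vcpu, unsigned long gfn)
    {
        inject_page_not_present(vcpu, gfn);
        rdma_read_async(vcpu->vm, gfn, on_page_ready, vcpu);
    }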
Future Work: Flash Cloning:
Virtual machine (VM) fork is a new cloud computing abstraction that
instantaneously clones a VM into multiple replicas running on different hosts.
All replicas share the same initial state, matching the intuitive semantics of
stateful worker creation. VM fork thus enables the straightforward creation and
efficient deployment of many tasks demanding swift instantiation of stateful
workers in a cloud environment, e.g. excess load handling, opportunistic job
placement, or parallel computing.
The lack of instantaneous stateful cloning forces users of cloud computing into
ad hoc practices to manage application state and cycle provisioning. As a
result, we aim to provide sub-second VM cloning that scales to hundreds of
workers, consumes few cloud I/O resources, and incurs negligible runtime
overhead.
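As a purely hypothetical illustration of the abstraction (this API does not
exist in Hecatonchire today), a VM fork call could look like this:

    #define N_WORKERS 100

    /* Hypothetical VM fork API: clone this VM into N_WORKERS replicas
     * that all start from the current state, with memory propagated
     * lazily, post-copy style. Neither vm_fork() nor assign_task()
     * exists; they illustrate the intended semantics only. */
    void spawn_workers(void)
    {
        struct vm_replica *replicas[N_WORKERS];
        int n = vm_fork(replicas, N_WORKERS);

        for (int i = 0; i < n; i++)
            assign_task(replicas[i], i);   /* each clone is stateful */
    }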