Under the hood

How does IoTIFY simulate your job when you submit a run? What happens behind the scenes? Learn more about our orchestration strategy.

A key requirement of a simulator is seamless scalability without degrading simulation performance. IoTIFY is designed to be horizontally scalable in this regard: you can add more clients to a simulation without affecting the baseline performance of the other clients. The architecture of IoTIFY uses Docker containers extensively, which enables seamless global scalability. As long as the IoTIFY agents have connectivity to your IoT backend, they can run and simulate the job. However, distributing, orchestrating, and collecting the results of a test relies on certain internal strategies that are worth a look.

Let's follow your simulation as it is submitted from either the UI or through the API.

Job Submission

When a simulation job is submitted, it is sent to a job queue. Based on the current availability of nodes and the job settings, the total number of clients required for the simulation is divided into smaller groups of, say, 100 clients each. The simulation for each chunk of clients is then submitted as an individual task. For example, if you would like to simulate 1000 clients, your first 100 clients could be simulated by one VM while the next 100 clients run on another machine. Once all tasks have been distributed, the job is marked as running.
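The chunking step can be pictured with a minimal Python sketch. It is an illustration only, not IoTIFY's scheduler: the function name `split_into_tasks` and the fixed chunk size of 100 are assumptions taken from the example figures in the paragraph above, and the real group size depends on node availability and job settings.

```python
def split_into_tasks(total_clients: int, chunk_size: int = 100) -> list[range]:
    """Split client indices 0..total_clients-1 into consecutive chunks.

    Hypothetical helper: each returned range stands for one task that
    could be handed to a different node.
    """
    if total_clients < 0 or chunk_size <= 0:
        raise ValueError("total_clients must be >= 0 and chunk_size > 0")
    return [
        range(start, min(start + chunk_size, total_clients))
        for start in range(0, total_clients, chunk_size)
    ]


tasks = split_into_tasks(1000)
print(len(tasks))   # number of tasks created for 1000 clients
print(tasks[1])     # client indices covered by the second task
```

With these assumed numbers, 1000 clients become 10 tasks, and the second task covers clients 100 to 199. A client count that is not a multiple of the chunk size simply yields a smaller final task.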

Connection Initiation

When a task is scheduled at a node agent, it calls the Device Model setup function before establishing a connection to the server. This is particularly useful if you want to set up credentials for the connection or even dynamically control which client connects to which server.

When a task starts, all the clients within the task must complete the setup function before the message function can run for the first client. The timeout limit specified in the template comes into play here. If a particular client cannot connect to the server within the specified time, its result is marked as failed and it will not be able to send any messages to the cloud platform. In summary, every client must either connect successfully or definitively fail to connect before the first iteration is executed and the first message is sent to the server. A client that fails to connect simply sits idle while the other clients continue to run normally.
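The "connect or time out, then start" behavior described above resembles a barrier. The following Python sketch models it with `asyncio`, under stated assumptions: `setup_client` and `run_task` are hypothetical names, the connection attempt is simulated with a sleep, and none of this reflects the actual agent code.

```python
import asyncio


async def setup_client(connect_delay: float, timeout: float) -> bool:
    """Simulate one client's setup; return True if it connected in time."""
    try:
        # asyncio.sleep stands in for the real connection attempt.
        await asyncio.wait_for(asyncio.sleep(connect_delay), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def run_task(connect_delays: dict[int, float], timeout: float) -> dict[int, str]:
    """Wait for every client to connect or fail, then run the first iteration."""
    ids = list(connect_delays)
    # Barrier: gather returns only once each client has connected or timed out.
    connected = await asyncio.gather(
        *(setup_client(connect_delays[cid], timeout) for cid in ids)
    )
    status = {}
    for cid, ok in zip(ids, connected):
        # Failed clients sit idle; the others proceed to send messages.
        status[cid] = "sent first message" if ok else "idle (connect failed)"
    return status


# Client 2 needs 1 s to connect but the timeout is 0.1 s, so it fails.
status = asyncio.run(run_task({0: 0.01, 1: 0.02, 2: 1.0}, timeout=0.1))
print(status)
```

In this model the slowest outcome, whether a successful connection or a timeout, determines when the first iteration of the whole task can begin, which is why a generous timeout delays the first message when some clients cannot connect.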

The connection limits/second setting is an advanced variable that slows down connection initiation across all the clients in all of the jobs in order to enforce a global limit. Note that you should increase the connection timeout value if you are applying a global connection limit.
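As a rough guide to how much the timeout might need to grow, the sketch below estimates a lower bound from the rate limit. It assumes that time spent waiting for a connection slot counts against the client's timeout, which the note above suggests but does not state outright. The function name and the `connect_time_s` allowance are hypothetical.

```python
def estimate_min_timeout(total_clients: int,
                         limit_per_second: float,
                         connect_time_s: float = 1.0) -> float:
    """Rough lower bound (seconds) for the connection timeout under a rate limit.

    Assumption: the last client in line waits about
    total_clients / limit_per_second seconds before it may connect,
    then needs connect_time_s for the connection itself.
    """
    if limit_per_second <= 0:
        raise ValueError("limit_per_second must be positive")
    return total_clients / limit_per_second + connect_time_s


# 1000 clients at 50 connections/second: about 20 s of queueing plus 1 s to connect.
print(estimate_min_timeout(1000, 50))
```

Because the limit applies across all jobs, `total_clients` in such an estimate should count the clients of every job that is connecting at the same time, not only those of the job being tuned.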

Message Sending

Once the setup function has run and the connection is established, all clients send messages to the server independently. The clients within the same task send messages almost simultaneously; however, the execution of tasks across multiple containers may not be fully synchronized. This means that the clients in your simulation may not all start at exactly the same time (which is a good thing), but the interval between each client's messages stays fixed. This effect also mimics the real-world behavior of devices, which do not send data at exactly the same time.
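The timing pattern can be illustrated with a small Python sketch. The start offsets and the 10-second interval are made-up values, and `send_times` is a hypothetical helper. It only shows that tasks starting at different moments still keep the same spacing between messages.

```python
def send_times(start_offset: float, interval: float, iterations: int) -> list[float]:
    """Timestamps (seconds) at which a client in one task sends its messages."""
    return [start_offset + i * interval for i in range(iterations)]


# Two tasks on different containers: the second starts 0.25 s later (assumed value).
task_a = send_times(start_offset=0.0, interval=10.0, iterations=3)
task_b = send_times(start_offset=0.25, interval=10.0, iterations=3)

# Gaps between consecutive messages are identical in both tasks.
gaps_a = [later - earlier for earlier, later in zip(task_a, task_a[1:])]
gaps_b = [later - earlier for earlier, later in zip(task_b, task_b[1:])]
print(task_a, task_b)
print(gaps_a, gaps_b)
```

The two tasks never send at the same instant, yet each keeps a constant 10-second gap, so the backend sees a load that is slightly spread out rather than perfectly aligned bursts.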