# Many-Core Timing Analysis of Real-Time Systems and its application to an industrial processor

# Hamza Rihani

Université Grenoble Alpes / Verimag

December 1, 2017



#### Jury:

| Pr. Jan Reineke        | Saarland University                | Reviewer   |
|------------------------|------------------------------------|------------|
| Pr. Christine Rochange | Université de Toulouse             | Reviewer   |
| Dr. Robert I. Davis    | University of York                 | Examiner   |
| Dr. Benoît de Dinechin | Kalray SA                          | Examiner   |
| Dr. Claire Maïza       | Université Grenoble Alpes          | Supervisor |
| Dr. Matthieu Moy       | Université Claude Bernard - Lyon 1 | Advisor    |
|                        |                                    |            |

#### Introduction: Real-Time Systems

Many-Core Timing Analysis of Real-Time Systems

Definition (Real-Time Systems)

A system that must produce valid outputs before a deadline.

### Introduction: Real-Time Systems

Many-Core Timing Analysis of Real-Time Systems

#### Definition (Real-Time Systems)

A system that must produce valid outputs before a deadline.

- Soft Real-Time
  - Global Positioning System device
  - Smartphones

#### Hard Real-Time

- Automatic Braking System
- Flight Management System







### Introduction: Real-Time Systems

Many-Core Timing Analysis of Real-Time Systems

#### Definition (Real-Time Systems)

A system that must produce valid outputs before a deadline.

- Soft Real-Time
  - Global Positioning System device
  - Smartphones
- Hard Real-Time ★
  - Automatic Braking System
  - Flight Management System





Many-Core **Timing Analysis** of Real-Time Systems

How long will the truck wait to cross the road?



Many-Core **Timing Analysis** of Real-Time Systems

How long will the truck wait to cross the road?





waits for the green light

# Many-Core **Timing Analysis** of Real-Time Systems

#### How long will the truck wait to cross the road?





waits for the green light

grants each direction at a time

Many-Core **Timing Analysis** of Real-Time Systems

#### How long will the truck wait to cross the road?



waits for the green light grants each direction at a time gives a priority to the cars

Many-Core **Timing Analysis** of Real-Time Systems

#### How long will the truck wait to cross the road?



- Crossroad is a shared resource
- Vehicles request accesses to pass
- Arbitration Policies:



Time Division Multiple Access

Round Robin

Fixed Priority

#### Introduction: Many-Cores in Real Time Systems



4 / 52

#### Introduction: Many-Cores in Real Time Systems



#### Timing analysis of cores

• Existing tools for pipeline and cache analyses

### Introduction: Many-Cores in Real Time Systems



#### Timing analysis of cores

• Existing tools for pipeline and cache analyses

#### Where is the potential interference?

- O Shared buses and memory \*
- NoC routing
- Shared I/O controllers

### Contributions

#### Contribution 1

Analysis of Time Division Multiple Access policy

• Approach based on Satisfiability Modulo Theory



### Contributions

#### Contribution 1

Analysis of Time Division Multiple Access policy

• Approach based on Satisfiability Modulo Theory



#### Contribution 2

Response time analysis of a many-core processor

- Synchronous Data Flow programs
- Model of the shared bus arbiter



The High Five, Dallas, Texas, USA

### Outline

#### I TDMA Bus Timing Analysis



Many-Core Response Time Analysis



# **TDMA Bus Timing Analysis**

















• **Offsets** *off*<sub>1</sub>, *off*<sub>2</sub>, *off*<sub>3</sub> relative to the TDMA period:

 $off_{\{1,2,3\}} = time\_instant(req_{\{1,2,3\}}) \mod \pi$ 

# **Outline: TDMA Bus Timing Analysis**

Approaches in WCET Analysis of TDMA

- 2 WCET Analysis by SMT EncodingNaive SMT Approach
  - Offset-based SMT Encoding
- 3 Experimental Evaluation
- 4 Summary and Future Work of Part I



# **Outline: TDMA Bus Timing Analysis**

#### Approaches in WCET Analysis of TDMA

- 2 WCET Analysis by SMT Encoding
  - Naive SMT Approach
  - Offset-based SMT Encoding
- 3 Experimental Evaluation
- 4 Summary and Future Work of Part I

TDMA Bus Timing Analysis

Many-Core Response Time Analysis





Goal: Estimate the WCET



#### Goal: Estimate the WCET

- → Existing approaches:
- 1 Worst-case everywhere

[Altmeyer et al., 2015; Rosèn et al., 2007...]





### Approaches in WCET Analysis of TDMA

2

Capture all possible offsets

[Kelter et al., 2014]

[Chattopadhyay et al., 2010]



Feasible Path Analysis with SMT

[Henry et al., 2014]

### Approaches in WCET Analysis of TDMA



# **Outline: TDMA Bus Timing Analysis**

Approaches in WCET Analysis of TDMA

- WCET Analysis by SMT Encoding
   Naive SMT Approach
   Offset-based SMT Encoding
- 3 Experimental Evaluation
- 4 Summary and Future Work of Part I

TDMA Bus Timing Analysis

Many-Core Response Time Analysis



- Bounded Model Checking
  - Encode the semantics into a Satisfiability Modulo Theory problem

- Bounded Model Checking
  - Encode the semantics into a Satisfiability Modulo Theory problem

$$\underbrace{SMT \text{ query}}_{assert(\land expr)} = \text{``Is there a trace with a feasible path?''}$$

• SMT-solver response:

- SAT: There is a feasible execution path
- UNSAT: There is no feasible execution path

- Bounded Model Checking
  - Encode the semantics into a Satisfiability Modulo Theory problem
- Add execution times on the paths

$$\underbrace{SMT \text{ query}}_{assert(\land expr)} = \text{``Is there a trace with a feasible path}$$

- SMT-solver response:
  - SAT: There is a feasible path with an execution time >X
  - UNSAT: X is an upper-bound on WCET

- Bounded Model Checking
  - Encode the semantics into a Satisfiability Modulo Theory problem
- Add execution times on the paths

$$\underbrace{SMT \text{ query}}_{assert(\land expr)} = \text{"Is there a trace with a feasible path}$$

- SMT-solver response:
  - SAT: There is a feasible path with an execution time >X
  - UNSAT: X is an upper-bound on WCET

#### Goal

Find the smallest X, such that Execution Time > X is UNSAT

### **Example: Semantics and Timing Encoding**



#### Previous work in [Henry et al., 2014] $\blacktriangleright$ b; "true" $\stackrel{\text{def}}{\longleftrightarrow}$ B; executed

▶ 
$$t_{i,j}$$
 "true"  $\iff B_i \to B_j$  taken

### **Example: Semantics and Timing Encoding**



Previous work in [Henry et al., 2014]

►  $b_i$  "true"  $\stackrel{\text{def}}{\longleftrightarrow} B_i$  executed

► 
$$t_{i,j}$$
 "true"  $\iff B_i \to B_j$  taken

•  $e_{i,j}$  execution time at transition  $B_i \rightarrow B_j$ 

### **Example: Semantics and Timing Encoding**



### Naive SMT Encoding



*tdma\_cost(e<sub>entry</sub>)*: returns the execution time of a bus access



### Performance of the Naive Encoding



#### Performance of the Naive Encoding



#### Naive SMT Encoding



#### Offset-based SMT Encoding



• off<sub>*i*,*j*</sub> =  $e_{i,j} \mod \pi$ 

#### Offset-based SMT Encoding



19 / 52

#### Offset-based SMT Encoding



*tdma\_cost*: returns the time after a bus access



*tdma\_offset*: returns the offset after a bus access



#### Performance of the Offset-based Encoding



## **Outline: TDMA Bus Timing Analysis**

Approaches in WCET Analysis of TDMA

- 2 WCET Analysis by SMT Encoding
  - Naive SMT Approach
  - Offset-based SMT Encoding

#### 3 Experimental Evaluation

4 Summary and Future Work of Part I

TDMA Bus Timing Analysis



### **Proof-of-Concept Implementation**



### **Proof-of-Concept Implementation**



### **Evaluation: Benchmark Descriptions**

#### Benchmark from TACLEBench suite <sup>1</sup>

| Name         | Description                                           | #LLVM instr. | #bus access |
|--------------|-------------------------------------------------------|--------------|-------------|
| bs           | Binary search                                         | 231          | 12          |
| insertsort   | Insertion sort on<br>a reversed array                 | 493          | 65          |
| jfdctint     | Discrete Cosine<br>Transformation                     | 2334         | 448         |
| fdct         | Fast Discrete<br>Cosine<br>Transform                  | 2502         | 385         |
| compressdata | Data<br>compression<br>program adopted<br>from SPEC95 | 674          | 131         |
| fly-by-wire  | UAV fly-by-wire software                              | 2815         | 515         |

<sup>1</sup>https://github.com/tacle/tacle-bench

#### **Evaluation: Experiments**



Comparison between estimated WCET and pessimistic WCET

## **Outline: TDMA Bus Timing Analysis**

Approaches in WCET Analysis of TDMA

- 2 WCET Analysis by SMT Encoding
  - Naive SMT Approach
  - Offset-based SMT Encoding
- 3 Experimental Evaluation
- 4 Summary and Future Work of Part I

TDMA Bus Timing Analysis



#### Summary of Part I

- SMT encodings for TDMA access
- Feasible path analysis combined with the WCET computation
- Comparison between different encodings
- Validation with small but relevant benchmarks

#### Find in the manuscript:

- Linearization of SMT encoding (modulo operators)
- Other possible SMT encodings

#### Future Work of Part I



#### Future Work of Part I



→ SMT is an interesting research direction for WCET Analysis

### From TDMA to Other Arbitration Policy



### From TDMA to Other Arbitration Policy







#### Analysis of Large Multi and Many-Cores



- Exact analysis
- Account for any interference globally during the task's execution

#### Analysis of Large Multi and Many-Cores



- Exact analysis
- Account for any interference globally during the task's execution
- **3** Exploit any information about:
  - Kalray MPPA2
- The target architecture
  - Reduce the interference
  - Model precisely the shared resources
- The target application model Synchronous Data Flow



### **Outline: Many-Core Response Time Analysis**

- Implementation Choices of Synchronous Data Flow Programs 5
- Multicore Response Time Analysis of SDF Programs 6
- Target Many-Core: Kalray MPPA2
- Evaluation
- Summary and Future Work of Part II 9





### **Outline: Many-Core Response Time Analysis**

#### Implementation Choices of Synchronous Data Flow Programs 5

- Multicore Response Time Analysis of SDF Programs
- Target Many-Core: Kalray MPPA2
- 9 Summary and Future Work of Part II

TDMA Bus Timing Analysis





#### Implementation Choices: SDF on Multi/Many-Cores



Static, time triggered, non-preemptive scheduling



#### Implementation Choices: SDF on Multi/Many-Cores



#### Implementation Choices: SDF on Multi/Many-Cores





#### An execution instance is:

- Direct Acyclic Task Graph
- Mono-rate (or at least harmonic rates)
- Fixed mapping and execution order



#### An execution instance is:

- Direct Acyclic Task Graph
- Mono-rate (or at least harmonic rates)
- Fixed mapping and execution order Each task τ<sub>i</sub>:





#### An execution instance is:

- Direct Acyclic Task Graph
- Mono-rate (or at least harmonic rates)
- Fixed mapping and execution order Each task τ<sub>i</sub>:





#### An execution instance is:

- Direct Acyclic Task Graph
- Mono-rate (or at least harmonic rates)
- Fixed mapping and execution order

Each task  $\tau_i$ :

• Release date (*rel*<sub>i</sub>). Response time (*R*<sub>i</sub>)





#### An execution instance is:

- Direct Acyclic Task Graph
- Mono-rate (or at least harmonic rates)
- Fixed mapping and execution order

Each task  $\tau_i$ :

• Release date (*rel*<sub>i</sub>). Response time (*R*<sub>i</sub>)





#### An execution instance is:

- Direct Acyclic Task Graph
- Mono-rate (or at least harmonic rates)
- Fixed mapping and execution order

Each task  $\tau_i$ :

• Release date  $(rel_i)$ . Response time  $(R_i)$ 



# Model of the Application



#### An execution instance is:

- Direct Acyclic Task Graph
- Mono-rate (or at least harmonic rates)
- Fixed mapping and execution order

Each task  $\tau_i$ :

• Release date  $(rel_i)$ . Response time  $(R_i)$ 

## Static Non-Preemptive Scheduling

Q Find R<sub>i</sub> including interference
Q Find rel<sub>i</sub> respecting dependencies



# **Outline: Many-Core Response Time Analysis**

#### Multicore Response Time Analysis of SDF Programs 6

- Target Many-Core: Kalray MPPA2
- 9 Summary and Future Work of Part II

TDMA Bus Timing Analysis



Many-Core Response Time Analysis



# **Response Time Analysis**







# **Response Time Analysis**



$$\forall i: R_i = WCET_i + I^{BUS}(R_i) + \dots$$

For each task *i*:

- Response Time
- WCET in isolation
- Bus Interference







#### Dependent tasks



# **Response Time Analysis**









**Q** Recursive formula  $\Rightarrow$  iterative algorithm.





1 Start with initial release dates





Start with initial release dates
 Compute response times

38 / 52





Start with initial release dates
 Compute response times

... ...



Start with initial release dates
 Compute response times

 ... ... a fixed-point is reached!





- **1** Start with initial release dates
- 2 Compute response times ... ... a fixed-point is reached!
- 3 Update the release dates





- 1 Start with initial release dates
- 2 Compute response times
  - ... ... a fixed-point is reached!
- Update the release dates
- 4 Repeat





- 1 Start with initial release dates
- 2 Compute response times ... ... a fixed-point is reached!
- 3 Update the release dates
- 4 Repeat until no release date changes (another fixed-point iteration).







- Convergence of the 1<sup>st</sup> fixed-point iteration:
  - $\circ\,$  Monotonic and bounded  $\checkmark\,$















# **Outline: Many-Core Response Time Analysis**

- Multicore Response Time Analysis of SDF Programs
- Target Many-Core: Kalray MPPA2
- 9 Summary and Future Work of Part II



Many-Core Response Time Analysis





- Kalray MPPA2 (codenamed Bostan)
- $\circ~16~compute~clusters$  + 4 I/O clusters
- Dual NoC



#### Per cluster:

- $\circ$  16 cores + 1 Resource Manager
- NoC Tx, NoC Rx, Debug Unit
- 16 shared memory banks (total: 2 MB)



#### Per cluster:

- $\circ$  16 cores + 1 Resource Manager
- NoC Tx, NoC Rx, Debug Unit
- 16 shared memory banks (total: 2 MB)
- 1 bus arbiter per memory bank



#### Per cluster:

- $\circ$  16 cores + 1 Resource Manager
- NoC Tx, NoC Rx, Debug Unit
- 16 shared memory banks (total: 2 MB)
- 1 bus arbiter per memory bank

- Possible spatial isolation assigning memory banks to cores
- Task execution model:
  - execute in a "local" bank
  - write to a "remote" bank
- Interference from communications

















42 / 52

# **Outline: Many-Core Response Time Analysis**

- Multicore Response Time Analysis of SDF Programs
- Target Many-Core: Kalray MPPA2

#### Evaluation 8

9 Summary and Future Work of Part II

TDMA Bus Timing Analysis



Many-Core Response Time Analysis



# Case Study: ROSACE, a Flight Management System Controller<sup>2</sup>



<sup>&</sup>lt;sup>2</sup> Pagetti et al., 2014

Case Study: ROSACE, a Flight Management System Controller<sup>2</sup>



#### Unrolled execution



# Evaluation: ROSACE Case Study

| Function   | WCET (cycles) | Memory accesses |
|------------|---------------|-----------------|
| altitude   | 275           | 22              |
| az_filter  | 274           | 22              |
| h_filter   | 326           | 24              |
| q_filter   | 338           | 24              |
| va_control | 303           | 24              |
| va_filter  | 301           | 23              |
| vz_control | 320           | 25              |
| vz_filter  | 334           | 25              |

- Values obtained from measurements
- $\circ\,$  Memory accesses from data and instruction cache misses + communications
- Moreover:
  - NoC Rx: writes 5 words
  - NoC Tx: reads 2 words

# Evaluation: ROSACE Case Study

| Function   | WCET (cycles) | Memory accesses |
|------------|---------------|-----------------|
| altitude   | 275           | 22              |
| az_filter  | 274           | 22              |
| h_filter   | 326           | 24              |
| q_filter   | 338           | 24              |
| va_control | 303           | 24              |
| va_filter  | 301           | 23              |
| vz_control | 320           | 25              |
| vz_filter  | 334           | 25              |

- Values obtained from measurements
- Memory accesses from data and instruction cache misses + communications
- Moreover:
  - NoC Rx: writes 5 words
  - NoC Tx: reads 2 words

**L** Experiments: Find the smallest schedulable hyper-period













INPUT SCADE/Lustre application

> Executable binary for the Kalray MPPA Bostan OUTPUT

47 / 52



| Executable binary |  |  |
|-------------------|--|--|
| for the Kalray    |  |  |
| MPPA Bostan       |  |  |
| OUTPUT            |  |  |



| Executable binary |  |  |
|-------------------|--|--|
| for the Kalray    |  |  |
| MPPA Bostan       |  |  |
| OUTPUT            |  |  |



# **Outline: Many-Core Response Time Analysis**

- Multicore Response Time Analysis of SDF Programs
- Target Many-Core: Kalray MPPA2
- Summary and Future Work of Part II 9



Many-Core Response Time Analysis



• A response time analysis of synchronous data flow programs on the Kalray MPPA2

• A response time analysis of synchronous data flow programs on the Kalray MPPA2

• Given:

- Task profiles: WCET in isolation and number of accesses
- Mapping of Tasks
- Execution Order

• A response time analysis of synchronous data flow programs on the Kalray MPPA2

• Given:

- Task profiles: WCET in isolation and number of accesses
- Mapping of Tasks
- Execution Order
- We compute:
  - Tight response times taking into account the interference
  - Release dates respecting the dependency constraints

• A response time analysis of synchronous data flow programs on the Kalray MPPA2

• Given:

- Task profiles: WCET in isolation and number of accesses
- Mapping of Tasks
- Execution Order
- We compute:
  - Tight response times taking into account the interference
  - Release dates respecting the dependency constraints

model of the multi-level arbiter

• A response time analysis of synchronous data flow programs on the Kalray MPPA2

• Given:

- Task profiles: WCET in isolation and number of accesses
- Mapping of Tasks
- Execution Order



• A response time analysis of synchronous data flow programs on the Kalray MPPA2

• Given:

- Task profiles: WCET in isolation and number of accesses
- Mapping of Tasks
- Execution Order
- We compute:
  - Tight response times taking into account the interference
  - Release dates respecting the dependency constraints

double fixed-point algorithm

model of the multi-level arbiter

#### • Find in the manuscript:

- $\circ~$  Execution phases: execution phase + communication phase
- Support of: accesses pipelining, blocking and non-blocking accesses, bursts of accesses
- More experiments with randomly generated task graphs

- Model of the Resource Manager
- Analysis with a Real-Time Operating System



tighter estimation of context switches and other interrupts

- Model of the Resource Manager
- Analysis with a Real-Time Operating System

| s                     | NoC Rx NoC Tx         |       |                       |                 |                       |
|-----------------------|-----------------------|-------|-----------------------|-----------------|-----------------------|
| 8 shared memory banks | R                     | М     | D                     | SU              | 8 shared memory banks |
| mory                  | $P_6$                 | P7    | P <sub>14</sub>       | P <sub>15</sub> | d me                  |
| р<br>р                | <i>P</i> <sub>4</sub> | $P_5$ | P <sub>12</sub>       | P <sub>13</sub> | mory                  |
| share                 | P <sub>2</sub>        | $P_3$ | P <sub>10</sub>       | P <sub>11</sub> | ban                   |
| 8                     | $P_0$                 | $P_1$ | <i>P</i> <sub>8</sub> | <i>P</i> 9      | Ś                     |

tighter estimation of context switches and other interrupts

- Model of the Resource Manager
- Analysis with a Real-Time Operating System
- Model of the NoC accesses.

| s                     | NoC                   | NoC Rx                |                 | NoC Tx          |                       |  |
|-----------------------|-----------------------|-----------------------|-----------------|-----------------|-----------------------|--|
| 8 shared memory banks | R                     | RM D                  |                 | 5U              | 8 shared memory banks |  |
| mory                  | $P_6$                 | P7                    | P <sub>14</sub> | P <sub>15</sub> | d me                  |  |
| em be                 | <i>P</i> <sub>4</sub> | $P_5$                 | P <sub>12</sub> | P <sub>13</sub> | mory                  |  |
| share                 | <i>P</i> <sub>2</sub> | <i>P</i> <sub>3</sub> | P <sub>10</sub> | P <sub>11</sub> | ban                   |  |
| 8                     | <i>P</i> <sub>0</sub> | $P_1$                 | P <sub>8</sub>  | <i>P</i> 9      | ŝ                     |  |





- Multi/Many-cores in Real-Time Systems
  - Full isolation: for example TDMA
  - Bounded interference

- Multi/Many-cores in Real-Time Systems
  - Full isolation: for example TDMA
  - Bounded interference
- Precise modeling of hardware components

- Multi/Many-cores in Real-Time Systems
  - Full isolation: for example TDMA
  - Bounded interference
- Precise modeling of hardware components
- Research directions for multi-core analysis:

Multi-core scheduling and timing analysis framework (in RTSOPS 2016)



# Many-Core Timing Analysis of Real-Time Systems and its application to an industrial processor

# Hamza Rihani

Université Grenoble Alpes / Verimag

#### **Publications:**

- Rihani, Hamza et al. (2015). "WCET analysis in shared resources real-time systems with TDMA buses". In: Proceedings of the 23rd International Conference on Real-Time Networks and Systems.
- Rihani, Hamza, Claire Maiza, and Matthieu Moy (2016a). "Efficient Execution of Dependent Tasks on Many-Core Processors". In: RTSOPS 2016. 7th International Real-Time Scheduling Open Problems Seminar. Toulouse, France.
- Rihani, Hamza et al. (2016b). "Response Time Analysis of Synchronous Data Flow Programs on a Many-Core Processor". In: Proceedings of the 24th International Conference on Real-Time Networks and Systems (RTNS).



This work is funded by grant CAPACITES (PIA-FSN2 n°P3425-146798) from the French *Ministère de l'économie, des finances et de l'industrie.* 

# **References** I



Altmeyer, Sebastian et al. (2015). "A Generic and Compositional Framework for Multicore Response Time Analysis". In: Proceedings of the 23rd International Conference on Real Time and Networks Systems (RTNS), pp. 129–138.

Chattopadhyay, Sudipta, Abhik Roychoudhury, and Tulika Mitra (2010). "Modeling Shared Cache and Bus in Multicores for Timing Analysis". In: *Proceedings of the 13th International Workshop on Software and Compilers for Embedded Systems*. SCOPES '10. St. Goar, Germany: ACM, 6:1–6:10.

Henry, Julien et al. (2014). "How to Compute Worst-case Execution Time by Optimization Modulo Theory and a Clever Encoding of Program Semantics". In: Proceedings of the 2014 SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (LCTES), pp. 43–52.

Kelter, Timon et al. (2014). "Static Analysis of Multi-core TDMA Resource Arbitration Delays". In: *Real-Time Syst.* 50.2, pp. 185–229.

Rosèn, Jacob et al. (2007). "Bus Access Optimization for Predictable Implementation of Real-Time Applications on Multiprocessor Systems-on-Chip". In: *RTSS 2007*.

#### BACKUP



<sup>&</sup>lt;sup>1</sup>Altmeyer et al., RTNS 2015

Example: Fixed Priority bus arbiter, PE1 > PE0Bus access delay = 10



• Task of interest running on PE0:

 $R_0 = 10 + 3 \times 10$  (response time in isolation)

<sup>&</sup>lt;sup>1</sup>Altmeyer et al., RTNS 2015

Example: Fixed Priority bus arbiter, PE1 > PE0Bus access delay = 10



• Task of interest running on PE0:

 $R_0 = 10 + 3 \times 10$  (response time in isolation)  $R_1 = 10 + 3 \times 10 + 2 \times 10 = 60$ 

<sup>&</sup>lt;sup>1</sup>Altmeyer et al., RTNS 2015



• Task of interest running on PE0:

 $R_0 = 10 + 3 \times 10 \text{ (response time in isolation)}$  $R_1 = 10 + 3 \times 10 + 2 \times 10 = 60$  $R_2 = 10 + 3 \times 10 + 2 \times 10 + 2 \times 10 = 80$ 

<sup>&</sup>lt;sup>1</sup>Altmeyer et al., RTNS 2015



• Task of interest running on PE0:

 $R_0 = 10 + 3 \times 10$  (response time in isolation)

 $R_1 = 10 + 3 \times 10 + 2 \times 10 = 60$ 

 $R_2 = 10 + 3 \times 10 + 2 \times 10 + 2 \times 10 = 80$ 

 $R_3 = 10 + 3 \times 10 + 2 \times 10 + 2 \times 10 + 0 = 80$  (fixed-point)

<sup>&</sup>lt;sup>1</sup>Altmeyer et al., RTNS 2015

# **Evaluation: Runtime Perfomance**



Analysis time of randomly generated task graphs in log-log scale

# **Randomly Generated Task Graphs**







# **Cached Memory Operations**





by off<sub>*i*,*j*</sub> = 
$$e_{i,j} \mod \pi$$
  
Encode the costs of the basic blocks  
 $e_{i,j}$  (absolute time) ---→  $c_{i,j}$  (cost

 $c_{i,j}$  = ite  $t_{i,j}$  cost 0

"ite  $C \land B$ "  $\Leftrightarrow$  "if C then  $\land$  else B"







if  $\alpha < \pi$  then  $\alpha$  else  $\alpha - \pi$ 



# Using TDMA functions



write\_value(y) return()

# Using TDMA functions



*if..then..else (ite)* encoding:

sum encoding:

$$\begin{aligned} & \textit{off}_{i,j} = ( \text{if } t_{1,i} \text{ then } \textit{off}_{1,i} \\ & \textit{else if } t_{2,i} \text{ then } \textit{off}_{2,i} \\ & \textit{else } \dots \\ & \textit{else if } t_{K,i} \text{ then } \textit{off}_{K,i} \text{ else } \textit{0} ) \end{aligned}$$

$$off_i = \sum_{k=1}^{k=K} off_{k,i}$$
$$off_{i,j} = \text{if } t_{i,j} \text{ then } off_i \text{ else } 0$$

# **Performance 3**



#### How it works?

• Example with binary search:

```
Testing wcet >= 0... SAT (value found = 18).

New interval = [18, 73].

Testing wcet >= 46... UNSAT. New interval = [18, 45].

Testing wcet >= 32... UNSAT. New interval = [18, 31].

Testing wcet >= 25... UNSAT. New interval = [18, 24].

Testing wcet >= 21... UNSAT. New interval = [18, 20].

Testing wcet >= 19... UNSAT. New interval = [18, 18].

The maximum value of wcet is 18.

Computation time is 0.010000s
```

# **Evaluation:** Analysis Time

#### Analysis time

| Name         | $\pi$ = 40, $\sigma$ = 20, acc = 10 | $\pi = 400, \ \sigma = 200, \ acc = 40$ |
|--------------|-------------------------------------|-----------------------------------------|
| bs           | 0.45s                               | 0.80s                                   |
| insertsort   | 1.37s                               | 7.19s                                   |
| jfdctint     | 44.10s                              | 55.47s                                  |
| fdct         | 41.36s                              | 34.42s                                  |
| compressdata | 4.66s                               | 3.44s                                   |
| fly-by-wire  | 28.78s                              | 109.37s                                 |