<span class="author">Claude Tadonki</span> Papers

P. Kiepas, J. Kozlak, C. Tadonki and C. Ancourt,
Profile-based Vectorization for MATLAB,
5th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming ARRAY 2018, Philadelphia, USA - June 19, 2018.

Yassir Samadi, Mostapha Zbakh, and C. Tadonki,
Workflow Scheduling Issues and Techniques in Cloud Computing: A Systematic Literature Review, Cloud Computing and Big Data: Technologies, Applications and Security, Zbakh, M., Essaaidi, M., Manneback, P., Rong, C. (Eds.), ISBN 978-3-319-97719-5, Springer, 2018.

Patryk Kiepas (MINES ParisTech / PSL University), Corinne Ancourt, C. Tadonki, and Jarosław Koźlak (AGH University of Science and Technology, Kraków),
Using performance event profiles to deduce an execution model of MATLAB with Just-In-Time compilation,
32nd Workshop on Languages and Compilers for Parallel Computing (LCPC 2019), Atlanta, USA, October 22-24, 2019.

L. Bouhouch, M. Zbakh, C. Tadonki
Data Migration - Cloudsim Extension,
3rd International Conference on Big Data Research (ICBDR 2019), Paris - France, 20-22 November, 2019.

L. Bouhouch, M. Zbakh, C. Tadonki
A Big Data Placement Strategy in Geographically Distributed Datacenters,
5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications, CloudTech 2020, Marrakesh, Morocco, November 24-26, 2020.

Alexandre Azevedo, Cristiana Bentes, Maria Clicia Stelling de Castro, C. Tadonki
Performance Analysis and Optimization of the Vector-Kronecker Product Multiplication,
WAMCA - 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2020, Porto, Portugal, September 9-11, 2020.

L. Bouhouch, M. Zbakh, C. Tadonki
A New Classification for Data Placement Techniques in Cloud Computing,
6th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications, CloudTech 2023, Marrakesh, Morocco, November 21-23, 2023.

Claude Tadonki, and Bernard Philippe,
Parallel multiplication of a vector by a Kronecker product of matrices,
Parallel Distributed Computing Practices PDCP, volume 2(4), 1999.

Abstract. We provide a parallel algorithm for the Kronecker product of matrices based on a cyclic partition of loops. Our general model unifies one scheme with redundant computations but no communication and an opposite scheme without redundant computation but with interprocessor communications. Both schemes can be mixed, with a balance influenced by the volume of floating point operations and the characteristics of the target architecture. The work is validated experimentally on the CENJU and INTEL PARAGON machines.
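
For orientation, the sequential kernel behind this line of work can be sketched as below. This is a minimal illustration, not the parallel algorithm of the paper: it multiplies a row vector by a Kronecker product one factor at a time, without ever forming the full product. The function names (`kron`, `vec_times_matrix`, `vec_times_kron`) are hypothetical, and all factors are assumed to be square dense matrices stored as lists of lists.

```python
def kron(a, b):
    """Explicit Kronecker product of two dense matrices (reference only)."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def vec_times_matrix(x, m):
    """Row vector times dense matrix: y[j] = sum_i x[i] * m[i][j]."""
    return [sum(x[i] * m[i][j] for i in range(len(x)))
            for j in range(len(m[0]))]


def vec_times_kron(x, factors):
    """Compute x * (A1 (x) A2 (x) ... (x) AN) factor by factor.

    Assumption: every factor is square, and len(x) equals the product
    of the factor sizes. The full Kronecker product is never built.
    """
    y = list(x)
    left = 1          # product of the sizes of the factors already applied
    right = len(x)    # product of the sizes of the factors not yet applied
    for f in factors:
        n = len(f)
        right //= n
        out = y[:]
        for l in range(left):
            for r in range(right):
                # Positions of the length-n slice along the current factor.
                idx = [(l * n + i) * right + r for i in range(n)]
                z = [y[t] for t in idx]
                for j, t in enumerate(idx):
                    out[t] = sum(z[i] * f[i][j] for i in range(n))
        y = out
        left *= n
    return y


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[0, 1], [1, 0]]
    x = [1, 2, 3, 4]
    # Both computations must agree.
    print(vec_times_kron(x, [A, B]) == vec_times_matrix(x, kron(A, B)))
```

Each factor touches independent slices of the vector, which is the kind of loop structure that a cyclic partition can distribute across processors.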

Claude Tadonki, and Bernard Philippe,
Parallel multiplication of a vector by a Kronecker product of matrices (part II),
Parallel Distributed Computing Practices PDCP, volume 3(3), 2000.

Abstract. This paper presents a revised and generalized version of our previous parallel algorithm for the Kronecker product of matrices. We show how to map the computation on any number of processors that is a divisor of the product of the matrix sizes (instead of being a prefix of this product as previously). Moreover, we show that the minimum number of parallel communications with p processors is log(p) whatever the algorithm, and that our algorithm achieves this optimal performance. The work is validated by significant efficiencies obtained from experimental measurements on the CRAY machine.

Sanjay Rajopadhye, Tanguy Risset, and Claude Tadonki,
Algebraic Path Problem on linear arrays,
Techniques et Sciences Informatiques TSI, 20 (5), 2001.

Abstract. We seek a linear SPMD implementation of the Warshall algorithm for the Algebraic Path Problem (a unified model of transitive closure, shortest paths, Gauss elimination, ...). Our parallel algorithm is systolic-like in its original version, and offers the important advantage of being able to run on a smaller number of processors by a natural round-robin remapping. We derive and validate a blocked version for standard distributed memory parallel machines, as the cost of data communication would be severe otherwise. We show experimental validations on the CRAY machine.
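
To make the "unified model" concrete, here is a minimal sequential sketch of a Warshall-style triple loop parameterized by the two semiring operations. It is an illustration only, not the systolic or blocked algorithm of the paper. The function name `algebraic_path` and its parameters are hypothetical, and the sketch assumes an idempotent semiring in which the closure of a scalar equals the multiplicative identity (true for shortest paths with non-negative weights and for Boolean transitive closure, but not for Gauss elimination, which needs an explicit closure operator).

```python
import math


def algebraic_path(a, add, mul, one):
    """Warshall-style closure of the square matrix `a` over a semiring.

    add, mul: the semiring "sum" and "product" operations.
    one: the identity of `mul`, placed on the diagonal (empty path).
    Assumption: `add` is idempotent and scalar closures equal `one`.
    """
    n = len(a)
    d = [row[:] for row in a]
    for i in range(n):
        d[i][i] = add(d[i][i], one)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = add(d[i][j], mul(d[i][k], d[k][j]))
    return d


if __name__ == "__main__":
    INF = math.inf
    # Shortest paths: (min, +) semiring, identity of + is 0.
    w = [[INF, 3, INF],
         [INF, INF, 1],
         [2, INF, INF]]
    print(algebraic_path(w, min, lambda x, y: x + y, 0))

    # Reflexive-transitive closure: (or, and) semiring, identity is True.
    g = [[False, True, False],
         [False, False, True],
         [False, False, False]]
    print(algebraic_path(g, lambda x, y: x or y, lambda x, y: x and y, True))
```

Only the two operations and the identity change between the shortest-path and transitive-closure instances; the loop nest stays the same.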

Claude Tadonki,
A Recursive Method for Graph Scheduling,
Scientific Annals of Cuza University, Vol 11, p.121-131, 2002

Abstract. This paper presents a recursive graph scheduling method. Our paradigm applies to any acyclic graph that can be partitioned into isomorphic subgraphs. Indeed, many common problem domains are 3D graphs that can be organized as a chain of isomorphic 2D subgraphs. Starting from any valid schedule of the source subgraph of the chain, we use the generic isomorphism to systematically derive a valid global schedule, with a local pipeline between two consecutive subgraphs. Our technique is then characterized by the nature of the partition and the source schedule. We discuss the impact of these characteristics on the complexity of the generated parallel schedules.

C. Beltran, C. Tadonki and J.-Ph. Vial,
Solving the p-median problem with a semi-Lagrangian relaxation,
Computational Optimization and Applications, Volume 35(2), October 2006.

Abstract. This paper deals with operations research and non-differentiable optimization. The so-called P-median problem is the problem of locating P "facilities" relative to a set of "customers" such that the sum of the shortest demand-weighted distances between "customers" and "facilities" is minimized. Indeed, this is a classical combinatorial optimization problem with a huge set of potential solutions. Using a semi-Lagrangian relaxation, we tackle the problem in its associated continuous formulation and report our non-smooth convex optimization engineering results.
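
For readers unfamiliar with the problem, the verbal definition above corresponds to the standard textbook integer program below. The notation is an assumption introduced here for illustration and is not taken from the paper: $w_i$ is the demand of customer $i$, $d_{ij}$ the distance from customer $i$ to candidate site $j$, $y_j = 1$ if a facility is opened at site $j$, and $x_{ij} = 1$ if customer $i$ is served from site $j$.

```latex
\begin{align*}
\min_{x,\,y} \quad & \sum_{i} \sum_{j} w_i \, d_{ij} \, x_{ij} \\
\text{s.t.} \quad & \sum_{j} x_{ij} = 1 && \text{for every customer } i, \\
                  & x_{ij} \le y_j       && \text{for every pair } (i, j), \\
                  & \sum_{j} y_j = P, \\
                  & x_{ij},\, y_j \in \{0, 1\}.
\end{align*}
```

The first constraint assigns each customer to exactly one site, the second allows assignment only to open sites, and the third fixes the number of open facilities to P.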

F. Babonneau, C. Beltran, A. Haurie, C. Tadonki and J.-P. Vial,
Proximal-ACCPM: a versatile oracle based optimization method,
Computational and Management Science, Volume 9, 2007.

Abstract. Oracle Based Optimization (OBO) conveniently designates an approach to handle a class of convex optimization problems in which the information pertaining to the function to be minimized and/or to the feasible domain takes the form of a linear outer approximation revealed by an oracle. We show how difficult problems can be cast in this format, and then solved within our context. We present our method, called Proximal-ACCPM, to implement the OBO approach and give a snapshot of numerical results. This paper summarizes several contributions with the OBO approach and aims to give, in a single report, enough information on the method and its implementation to facilitate new applications.

C. Tadonki,
Mathematical and Computational Engineering in X-Ray Crystallography,
International Journal of Advanced Computer Engineering, volume 1(2) 2008.

Abstract. The main purpose of X-Ray Crystallography is to predict a macromolecular structure from X-Ray synchrotron radiation. Among existing paradigms to achieve this task, analytical approaches come up as good candidates for automation through mathematical approximations and computational engineering. In addition to the latter, statistical processing is required in order to refine the data according to the physical model and the boarding effects of the experiments. We revisit the basis of the problem and focus on a more precise effect, so-called radiation damage, for which it has also been proven that it can be artificially managed to become instructive.

C. Tadonki,
Integer Programming Heuristic for the Dual Power Setting Problem in Wireless Sensors Networks,
Int. Journal of Advanced Research in Computer Engineering, vol 3(1) 2009.

Abstract. We seek an integer programming based heuristic for solving the dual power management problem in wireless sensor networks. For a given network with two possible transmission powers (low and high), the problem is to find a minimum size subset of nodes such that if they are assigned high transmission power while the others are assigned low transmission power, the network will be strongly connected. The main purpose behind this efficient setting is to minimize the total communication power consumption while maintaining the network connectivity. From a theoretical point of view, the problem is known to be difficult to solve exactly. An approach to approximate the solution is to work with a spanning tree of clusters. Each cluster is a strongly connected component when considering low transmission power. We follow the same approach, and we formulate the node selection problem inside clusters as an integer programming problem which is solved exactly using specialized codes. Experimental results show that our algorithm is efficient both in execution time and solution quality.

C. Tadonki, G. Grosdidier, O. Pene,
An efficient CELL library for lattice quantum chromodynamics,
ACM SIGARCH Computer Architecture News, vol 38(4) 2011.

Abstract. Quantum chromodynamics (QCD) is the theory of subnuclear physics, aiming at modeling the strong nuclear force, which is responsible for the interactions of nuclear particles. Numerical QCD studies are performed through a discrete formalism called LQCD (Lattice Quantum Chromodynamics). Typical simulations involve very large volumes of data and numerically sensitive entities, hence the crucial need for high performance computing systems. We propose a set of CELL-accelerated routines for basic LQCD calculations. Our framework is provided as a unified library and is particularly optimized for iterative use. Each routine is parallelized among the SPUs, and each SPU achieves its task by looping on small chunks of arrays from the main memory. Our SPU implementation is vectorized with double precision data, and the cooperation with the PPU shows a good overlap between data transfers and computations. Moreover, we permanently keep the SPU context and use mailboxes to synchronize between consecutive calls. We validate our library by using it to derive a CELL version of an existing LQCD package (tmLQCD). Experimental results on individual routines show a significant speedup compared to a standard processor, for instance 11 times better than a 2.83 GHz INTEL processor (without SSE). This ratio is around 9 (with a QS22 blade) when considering a more cooperative context like solving a linear system of equations (usually referred to as Wilson-Dirac inversion). Our results clearly demonstrate that the CELL is a very promising way for high-scale LQCD simulations.

T. Saidani, L. Lacassagne, J. Falcou, C. Tadonki, Samir Bouaziz,
Parallelization Schemes for Memory Optimization on the Cell Processor : A Case Study on the Harris Corner Detector,
Transactions on High-Performance Embedded Architectures and Compilers, volume 3(3) 2011.

Abstract. The Cell processor is a typical example of a heterogeneous multiprocessor on-chip architecture that uses several levels of parallelism to deliver high performance. Despite its efficiency potential, the execution mode and some hardware specificities make it non-trivial to deal with. Indeed, reducing the gap between peak performance and effective performance is the challenge for compiler design and efficient implementations. Image processing and media applications are typical "main stream" applications one could consider while investigating Cell benchmarks. Our investigations, through various implementations of the Harris detection algorithm, reveal that DMA-controlled data transfers and synchronizations between SPEs are key points for global performance.

El Wardani Dadi, El Mostafa Daoudi, C. Tadonki,
Improving 3D Shape Retrieval Methods based on Bag-of-Feature Approach by using Local Codebooks,
International Journal of Future Generation Communication and Networking, Vol. 5, No. 4, December, 2012.

Abstract. Recent investigations illustrate that view-based methods with pose normalization preprocessing perform better in retrieving rigid models than other approaches, and are still the most popular and practical methods in the field of 3D shape retrieval [1, 2, 3, 4, 5]. In this paper we present an improvement of 3D shape retrieval methods based on the bag-of-features approach. These methods use this approach to integrate a set of features extracted from 2D views of the 3D objects using the SIFT (Scale Invariant Feature Transform [6]) algorithm into histograms using vector quantization, which is based on a global visual codebook. In order to improve the retrieval performance, we propose to associate to each 3D object its local visual codebook instead of a unique global codebook. Experimental results obtained on the Princeton Shape Benchmark database, for the BF-SIFT method proposed by Ohbuchi, et al., and CM-BOF proposed by Zhouhui et al., show that the proposed approach performs better than the original methods.

D. Barthou, O. Brand-Foissac, O. Pene, G. Grosdidier, R. Dolbeau, C. Eisenbeis, M. Kruse, K. Petrov, C. Tadonki,
Automated Code Generation for Lattice Quantum Chromodynamics and beyond,
Journal of Physics: Conference Series, Institute of Physics: Open Access Journals, 510, pp.012005, 2014.

Abstract. We present here our ongoing work on a Domain Specific Language which aims to simplify Monte-Carlo simulations and measurements in the domain of Lattice Quantum Chromodynamics. The tool-chain, called Qiral, is used to produce high-performance OpenMP C code from LaTeX sources. We discuss conceptual issues and details of implementation and optimization. The comparison of the performance of the generated code to the well-established simulation software is also made.

C. Tadonki, F. Meyer, and F. Irigoin,
Dendrogram Based Algorithm for Dominated Graph Flooding,
Procedia Computer Science, vol. 29, pp. 586-598, 2014.

Abstract. In this paper, we are concerned with the problem of flooding undirected weighted graphs under ceiling constraints. We provide a new algorithm based on a hierarchical structure called dendrogram, which offers the significant advantage that it can be used for multiple flooding with various scenarios of the ceiling values. In addition, when exploring the graph through its dendrogram structure in order to calculate the flooding levels, independent sub-dendrograms are generated, thus offering a natural way for parallel processing. We provide an efficient implementation of our algorithm through suitable data structures and optimal organisation of the computations. Experimental results show that our algorithm outperforms well established classical algorithms, and reveal that the cost of building the dendrogram dominates the total running time, thus validating both the efficiency and the hallmark of our method. Moreover, we exploit the potential parallelism exposed by the flooding procedure to design a multi-thread implementation. As the underlying parallelism is created on the fly, we use a queue to store the list of the sub-dendrograms to be explored, and then use a cyclic distribution to assign them to the participating threads. This yields a load balanced and scalable process as shown by additional benchmark results. Our program runs in a few seconds on an ordinary computer to flood graphs with more than 20 million nodes.

A. Ferreira Leite, A. Boukerche, A. C. Magalhaes Alves de Melo, C. Eisenbeis, C. Tadonki, and C. Ghedini Ralha,
Power-Aware Server Consolidation for Federated Clouds,
J Concurrency and Computation: Practice and Experience (CCPE), ISSN: 1532-0626, Wiley Press, New York, USA, 2016.

Abstract. Cloud computing has evolved to provide computing resources on-demand through a virtualized infrastructure, allowing applications, computing power, data storage, and network resources to be provisioned and managed over private networks or over the Internet. Cloud services normally run on large data centers and demand a huge amount of electricity. Consequently, the electricity cost represents one of the major concerns of data centers, since it is sometimes nonlinear with the capacity of the data centers, and it is also associated with a high amount of carbon emissions (CO2). However, energy-saving schemes that result in too much degradation of the system performance or in violations of service-level agreement (SLA) parameters would eventually cause the users to move to another cloud provider. Thus, there is a need to reach a balance between energy savings and the costs incurred by these savings in the execution of the applications. Therefore, in this paper we propose and evaluate a power and SLA-aware application consolidation solution for cloud federations. It comprises a multi-agent system (MAS) for server consolidation, taking into account service-level agreement, power consumption, and carbon footprint. Unlike similar solutions available in the literature, in our solution, when a cloud is overloaded its data center needs to negotiate with other data centers before migrating the workload to another cloud. Simulation results show that our approach can reduce power consumption by up to 46% while trying to meet performance requirements. Furthermore, we show that it can provide an adequate solution to deal with power consumption in the clouds.

A. Ferreira Leite, V. Alves, G. Nunes Rodrigues, C. Tadonki, C. Eisenbeis, A. C. Magalhaes Alves de Melo,
Dohko: An Autonomic System for Provision, Configuration, and Management of Inter-Cloud Environments based on a Software Product Line Engineering Method,
Cluster Computing Special, 2017.

Abstract. Configuring and executing applications across multiple clouds is a challenging task due to the various terminologies used by the cloud providers. Therefore, we advocate the use of autonomic systems to do this work automatically. Thus, in this paper, we propose and evaluate Dohko, an autonomic and goal-oriented system for inter-cloud environments. Dohko implements self-configuration, self-healing, and context-awareness properties. Likewise, it relies on a hierarchical P2P overlay (a) to manage the virtual machines running on the clouds and (b) to deal with inter-cloud communication. Furthermore, it depends on a software product line engineering (SPLE) method to enable applications' deployment and reconfiguration, without requiring pre-configured virtual machine images. Experimental results show that Dohko can free the users from the duty of executing non-native cloud applications on a single cloud and across many clouds. In particular, it tackles the lack of middleware prototypes that can support different scenarios when using simultaneous services from multiple clouds.

Y. Samadi, M. Zbakh, C. Tadonki,
Performance comparison between Hadoop and Spark frameworks using Hibench benchmarks,
Concurrency and Computation: Practice and Experience (CCPE), 2017.

Abstract. Big data has become one of the major areas of research for cloud service providers due to the large amount of data produced every day, and the inefficiency of traditional algorithms and technologies in handling these large amounts of data. Big data, with its characteristics such as Volume, Variety, and Veracity (3V) etc., requires efficient technologies to be processed in real-time. To solve this problem and to process and analyze this vast amount of data, there are many powerful tools like Hadoop and Spark, which are mainly used in the context of Big Data. They work following the principles of parallel computing. The challenge is to specify which Big Data tool is better depending on the processing context. In this paper, we present and discuss a performance comparison between two popular Big Data frameworks deployed on virtual machines. Hadoop MapReduce and Apache Spark are used to efficiently process a vast amount of data in parallel and distributed mode on large clusters, and both of them are suitable for Big Data processing. We also present the execution results of Apache Hadoop in Amazon EC2, a major Cloud Computing environment. To compare the performance of these two frameworks, we use the HiBench benchmark suite, which is an experimental approach for measuring the effectiveness of any computer system. The comparison is made based on three criteria: execution time, throughput and speed up. We test the Wordcount workload with different data sizes for more accurate results. Our experimental results show that the performance of these frameworks varies significantly based on the use case implementation. Furthermore, from our results we draw the conclusion that Spark is more efficient than Hadoop at dealing with a large amount of data in most cases. However, Spark requires higher memory allocation, since it loads the data to be processed into memory and keeps them in caches for a while, just like standard databases.
So, the choice depends on performance level and memory constraints.

O. Haggui, C. Tadonki, L. Lacassagne, F. Sayadi, B. Ounid,
Harris Corner Detection on a NUMA Manycore,
Future Generation Computer Systems (DOI: 10.1016/j.future.2018.01.048), 2018.

Abstract. Corner detection is a key kernel for many image processing procedures including pattern recognition and motion detection. The latter, for instance, mainly relies on the corner points for which spatial analyses are performed, typically on (probably live) videos or temporal flows of images. Thus, highly efficient corner detection is essential to meet the real-time requirement of associated applications. In this paper, we consider the corner detection algorithm proposed by Harris, whose main workflow is a composition of basic operators represented by their approximations using 3 × 3 matrices. The corresponding data access patterns follow a stencil model, which is known to require careful memory organization and management. Cache misses and other additional hindering factors with NUMA architectures need to be skillfully addressed in order to reach an efficient scalable implementation. In addition, with increasingly wide vector registers, an efficient SIMD version should be designed and explicitly implemented. In this paper, we study a direct and explicit implementation of common and novel optimization strategies, and provide a NUMA-aware parallelization. Experimental results on a dual-socket INTEL Broadwell-E/EP show a noticeably good scalability performance.

Y. Samadi, M. Zbakh, and C. Tadonki,
Graph-based Model and Algorithm for Minimizing Big Data Movement in a Cloud Environment,
Int. J. High Performance Computing and Networking, 2018.

Abstract. In this paper, we discuss load balancing and data placement strategies in heterogeneous Cloud environments. Load balancing is crucial in large-scale data processing applications, especially in a distributed heterogeneous context like the Cloud. The main goal in data placement strategies is to improve the overall performance through the reduction of data movements among the participating datacenters, taking into account the dependencies. Typically, datacenters are geographically distributed based on their characteristics such as the processing speed and the storage capacity, among other technical considerations. Load balancing and efficient data placement on Cloud systems are critical problems that are difficult to cope with simultaneously, especially in the emerging heterogeneous clusters. In this context, we propose a threshold-based load balancing algorithm, which first balances the load between datacenters, and afterwards minimizes the overhead of data exchanges. The proposed approach is divided into three phases. First, the dependencies between the datasets are identified. Second, the load threshold of each datacenter is estimated based on the processing speed and the storage capacity. Third, the load balancing between the datacenters is managed through the threshold parameters. The heterogeneity of the datacenters together with the dependencies between the datasets are both taken into account. Our experimental results show that our approach can efficiently reduce the frequency of data movement and keep a good load balancing between the datacenters.

Y. Samadi, M. Zbakh, and C. Tadonki,
DT-MG: many-to-one matching game for tasks scheduling towards resources optimization in cloud computing,
International Journal of Computers and Applications (DOI: 10.1080/1206212X.2018.1519630), 2018.

Abstract. The increasing demand for cloud computing motivates researchers to make cloud environments more efficient for their users and more profitable for the providers. More and more datacenters are being built to cater to customers' needs. However, datacenters consume large amounts of energy, and this draws negative attention. Therefore, cloud providers are under great pressure to reduce the energy consumed by datacenters. To address this issue, efficient algorithms to reduce energy consumption and to guarantee the quality of service are needed. In this paper, we propose a load balancing algorithm named DT-MG, which aims to reduce energy consumption and maximize the efficiency of the available resources. First, we use the Matching Game Theory model for assigning tasks to datacenters. We then study the optimal operation of the resources by migrating all the tasks of a physical machine running under sub-regime to other physical machines, followed by its systematic switch to standby mode. Experimental results show that the proposed approach reduces energy consumption and the number of task migrations while maintaining the service level agreement in comparison with some existing techniques.

A. Susungi and C. Tadonki,
Intermediate Representations for Explicitly Parallel Programs,
ACM Computing Surveys, Volume 54, Issue 5 (DOI: https://doi.org/10.1145/3452299), May 2021.

Abstract. While compilers generally support parallel programming languages and APIs, their internal program representations are mostly designed from the sequential programs standpoint (exceptions include source-to-source parallel compilers, for instance). This makes the integration of compilation techniques dedicated to parallel programs more challenging. In addition, parallelism has various levels and different targets, each of them with specific characteristics and constraints. With the advent of multi-core processors and general purpose accelerators, parallel computing is now a common and pervasive consideration. Thus, software support to parallel programming activities is essential to make this technical transition more realistic and beneficial. The case of compilers is fundamental as they deal with (parallel) programs at a structural level, thus the need for intermediate representations. This article surveys and discusses attempts to provide intermediate representations for the proper support of explicitly parallel programs. We highlight the gap between available contributions and their concrete implementation in compilers and then exhibit possible future research directions.

L. Bouhouch, C. Tadonki, M. Zbakh,
Dynamic Data Replication and Placement Strategy in Geographically Distributed Data centers,
Concurrency and Computation: Practice and Experience (CCPE) - DOI:10.1002/cpe.6858, 2022

Abstract. With the evolution of geographically distributed data centers in the Cloud Computing landscape along with the amount of data being processed in these data centers, which is growing at an exponential rate, processing massive data applications becomes an important topic. Since a given task may require many datasets for its execution and the datasets are spread over several different data centers, finding an efficient way to manage the datasets storage across nodes of a Cloud system is a difficult problem. In fact, the execution time of a task might be influenced by the cost of data transfers, which mainly depends on two criteria. The first one is the initial placement of the input datasets during the build-time phase, while the second is the replication of the datasets during the runtime phase. The replication is explicitly considered when datasets are being migrated over the data centers in order to make them locally available wherever needed. Data placement and data replication are important challenges in Cloud Computing. Nevertheless, many studies focus on data placement or data replication exclusively. In this paper, a combination of a data placement strategy followed by a dynamic data replication management strategy is proposed, with the purpose of reducing the associated cost of all data transfers between the (distant) data centers. Our proposed data placement approach considers the main characteristics of a data center such as storage capacity and read/write speeds to efficiently store the datasets, while our dynamic data replication management approach considers three parameters: the number of replicas in the system, the dependency between datasets and tasks and the storage capacity of data centers. The decision of when and whether to keep or to delete replicas is determined by the fulfillment of those three parameters. Our approach estimates the total execution time of the tasks as well as the monetary cost, considering the data transfers activity.
Our experiments are conducted using the Cloudsim simulator. The obtained results show that our proposed strategies produce an efficient data management by reducing the overheads of the data transfers, compared to both a data placement without replication (by 76%) and the selected data replication approach from Kouidri et al. (by 52%), and by improving the financial cost.

Alan L. Nunes, Alba Melo, C. Tadonki, Cristina Boeres, Daniel de Oliveira, Lucia M. de Assumpcao,
Optimizing computational costs of Spark for SARS-CoV-2 sequences comparisons on a commercial cloud,
Concurrency and Computation: Practice and Experience (CCPE) - DOI:10.1002/cpe.7678, March 2023

Abstract. Cloud computing is currently one of the prime choices in the computing infrastructure landscape. In addition to advantages such as the pay-per-use bill model and resource elasticity, there are technical benefits regarding heterogeneity and large-scale configuration. Alongside the classical need for performance, for example, time, space, and energy, there is an interest in the financial cost that might come from budget constraints. Based on scalability considerations and the pricing model of traditional public clouds, a reasonable optimization strategy output could be the most suitable configuration of virtual machines to run a specific workload. From the perspective of runtime and monetary cost optimizations, we provide the adaptation of a Hadoop applications execution cost model extracted from the literature aiming at Spark applications modeled with the MapReduce paradigm. We evaluate our optimizer model executing an improved version of the Diff Sequences Spark application to perform SARS-CoV-2 coronavirus pairwise sequence comparisons using the AWS EC2's virtual machine instances. The experimental results with our model outperformed 80% of the random resource selection scenarios. By only employing spot worker nodes exposed to revocation scenarios rather than on-demand workers, we obtained an average monetary cost reduction of 35.66% with a slight runtime increase of 3.36%.

Bouhouch L., Zbakh M., C. Tadonki,
Online Task Scheduling of Big Data Applications in the Cloud Environment,
Information 2023, 14(5), 292 - DOI:10.3390/info14050292, May 2023

Abstract. The development of big data has generated data-intensive tasks that are usually time-consuming, with a high demand on cloud data centers for hosting big data applications. It becomes necessary to consider both data and task management to find the optimal resource allocation scheme, which is a challenging research issue. In this paper, we address the problem of online task scheduling combined with data migration and replication in order to reduce the overall response time as well as ensure that the available resources are efficiently used. We introduce a new scheduling technique, named Online Task Scheduling algorithm based on Data Migration and Data Replication (OTS-DMDR). The main objective is to efficiently assign online incoming tasks to the available servers while considering the access time of the required datasets and their replicas, the execution time of the task in different machines, and the computational power of each machine. The core idea is to achieve better data locality by performing an effective data migration while handling replicas. As a result, the overall response time of the online tasks is reduced, and the throughput is improved with enhanced machine resource utilization. To validate the performance of the proposed scheduling method, we run in-depth simulations with various scenarios and the results show that our proposed strategy performs better than the other existing approaches. In fact, it reduces the response time by 78% when compared to the First Come First Served scheduler (FCFS), by 58% compared to the Delay Scheduling, and by 46% compared to the technique of Li et al. Consequently, the present OTS-DMDR method is very effective and convenient for the problem of online task scheduling.

Bouhouch L., Zbakh M., C. Tadonki,
DFMCloudsim: an extension of cloudsim for modeling and simulation of data fragments migration over distributed data centers,
International Journal of Computers and Applications, 46(1), pp. 1-20, DOI:10.1080/1206212X.2023.2277554, Jan 2024

Abstract. Due to the increasing volume of data for applications running on geographically distributed Cloud systems, the need for efficient data management has emerged as a crucial performance factor. Alongside basic task scheduling, the management of input data on distributed Cloud systems has become a genuine challenge, particularly with data-intensive applications. Ideally, each dataset should be stored in the same data center as its consumer tasks so as to lead to local data accesses only. However, when a given task does not need all items within one of its input datasets, sending that dataset entirely might lead to a severe time overhead. To address this concern, a data fragmentation strategy can be considered in order to partition the datasets and process them in that form. Such a strategy should be flexible enough to support any user-defined partitioning, and suitable enough to minimize the overhead of transferring the data in their fragmented form. To simulate and estimate the basic statistics of both fragmentation and migration mechanisms prior to an implementation in a real Cloud, we chose Cloudsim, with the goal of enhancing it with the corresponding extensions. Cloudsim is a popular simulator for Cloud Computing investigations. Our proposed extension is named DFMCloudsim, its goal is to provide an efficient module for implementing fragmentation and data migration strategies. We validate our extension using various simulated scenarios. The results indicate that our extension effectively achieves its main objectives and can reduce data transfer overhead by 74.75% compared to our previous work.

Carla Santana; Ramon C.F. Araujo; Idalmis Milian Sardina; Italo A.S. de Assis; Tiago Barros; Calebe P. Bianchini; Antonio D. de S. Oliveira; Joao M. de Araujo; Herve Chauris; C. Tadonki; Samuel Xavier-de-Souza,
DeLIA: A Dependability Library for Iterative Applications applied to parallel geophysical problems,
Computers & Geosciences, DOI:10.1016/j.cageo.2024.105662, June 2024

Abstract. Many geophysical imaging applications, such as full-waveform inversion, often rely on high-performance computing to meet their demanding computational requirements. The failure of a subset of computer nodes during the execution of such applications can have a significant impact, as it may take several days or even weeks to recover the lost computation. To mitigate the consequences of these failures, it is crucial to employ effective fault tolerance techniques that do not introduce substantial overhead or hinder code optimization efforts. This paper addresses the primary research challenge of developing fault tolerance techniques with minimal impact on execution and optimization. To achieve this, we propose DeLIA, a Dependability Library for Iterative Applications designed for parallel programs that require data synchronization among all processes to maintain a globally consistent state after each iteration. DeLIA efficiently performs checkpointing and rollback of both the application’s global state and each process’s local state. Furthermore, DeLIA incorporates interruption detection mechanisms. One of the key advantages of DeLIA is its flexibility, allowing users to configure various parameters such as checkpointing frequency, selection of data to be saved, and the specific fault tolerance techniques to be applied. To validate the effectiveness of DeLIA, we applied it to a 3D full-waveform inversion code and conducted experiments to measure its overhead under different configurations using two workload schedulers. We also analyzed its behavior in preemptive circumstances. Our experiments revealed a maximum overhead of 8.8%, and DeLIA demonstrated its capability to detect termination signals and save the state of nodes in preemptive scenarios. Overall, the results of our study demonstrate the suitability of DeLIA to provide fault tolerance for iterative parallel applications.

Roblex Nana Tchakoute; C. Tadonki; Petr Dokladal; Youssef Mesri,
Benchmark-Based Study of CPU/GPU Power-Related Features Through JAX and TensorFlow,
IEEE Access, DOI:10.1109/ACCESS.2025.3625414, October 2025.

Abstract. Energy has become a critical resource in the modern computing landscape, making power management a central focus in High-Performance Computing (HPC) and Artificial Intelligence (AI). While power management techniques like Dynamic Voltage and Frequency Scaling (DVFS), Power Capping, and ACPI/P-State CPU governors are well-established, their effectiveness is significantly influenced by the high-level structure of software frameworks. This paper presents a comprehensive empirical study of this interplay, evaluating the three aforementioned power management techniques on a dual-socket Intel Xeon “Ice Lake” CPU, a single-socket AMD EPYC “Zen3” CPU, and an NVIDIA A100 GPU. We run a suite of computational kernels using both TensorFlow and JAX to expose how framework-specific design choices mediate hardware-level power controls. Our results reveal that the best strategy for energy efficiency is highly context-dependent and relies on the specific combination of hardware, workload, and framework. We find that DVFS is the most effective on both Intel Xeon and AMD EPYC platforms, delivering significant Energy-Delay Product (EDP) reductions with minimal performance loss. In contrast, Power Capping is the most efficient technique for NVIDIA A100. A key finding is the notable influence of the software stack; for instance, JAX exhibits operational instability at the lowest GPU frequencies on the A100, while there is no limitation with TensorFlow under identical conditions. Our findings provide operational platform-specific guidance for practitioners, expose crucial robustness considerations for framework developers, and highlight the necessity of considering the software stack as an active variable in energy-aware computing.

C. Tadonki; Gabriele Mencagli; Leonel Sousa,
Leveraging Cutting-Edge High Performance Computing for Large-Scale Applications,
Future Generation Computer Systems, DOI:10.1016/j.future.2026.108374, January 2026.

Abstract. High Performance Computing (HPC) recently entered into the exascale era, marking an important milestone of its history. High-end supercomputers and clusters with remarkable levels of performance are now commonly available for general and specific computational needs, thereby increasing the focus on HPC and related topics. Leveraging the potential of high-speed processing units is an HPC skillful task that requires in-depth knowledge in both hardware and software domains. In fact, the architectural structure of cutting-edge HPC processors is complex and involves several specialized features provided through specific units/mechanisms, the processing constraint/overhead of which can turn out to be an efficiency bottleneck. Large-scale supercomputers present greater challenges due to the significant overhead associated with interprocessor communication and synchronization. The evolution of HPC appears closely tied to the growing demand for speed from large-scale applications like complex combinatorial problems, big data applications, the training of large-scale AI models and high-precision simulations, to name a few. As a result, the implementation of cutting-edge techniques should remain scalable on large-scale machines for the benefit of end-users.

Claude Tadonki,
Système d'équations récurrentes et multiplication parallèle d'un vecteur par un produit tensoriel de matrices,
Rencontres Francophones de Parallelisme Renpar'11, Rennes (France), 1999.

Sanjay Rajopadhye, Tanguy Risset, and Claude Tadonki,
The algebraic path problem revisited,
European Conference on Parallel Computing Europar99, Toulouse (France), LNCS Springer-Verlag, N° 1685, p. 698-707, August 1999.

Claude Tadonki,
Ordonnancements canoniques,
Renpar12, Rencontres Francophones de Parallelisme, Besançon (France), June 2000.

Claude Tadonki,
Parallel Cholesky Factorization,
Parallel Matrix Algorithms and Applications PMAA Workshop, Neuchatel (Switzerland), August 2000.

Claude Tadonki and Bernard Philippe,
Méthodologie de conception d'algorithmes efficaces pour le produit tensoriel,
CARI2000, Tananarive (Madagascar), October 2000.

Patrice Quinton, Claude Tadonki, and Maurice Tchuente,
Un échéancier systolique et son utilisation dans l'ATM,
CARI2000, Tananarive (Madagascar), October 2000.

Claude Tadonki,
Complexité des ordonnancements canoniques et dérivation d'architecture,
Rencontres Francophones de Parallelisme Renpar13, Paris (France), April 2001.

Claude Tadonki,
A Recursive Method for Graph Scheduling,
International Symposium on Parallel and Distributed Computing (SPDC), Iasi, Romania, July 2002.

R. Ndoundam, C. Tadonki, and M. Tchuente,
Parallel chip firing game associated with n-cube orientation,
International Conference on Computational Science, ICCS04 (LNCS/Springer), Krakow, Poland, June 2004.

T. Saidani, J. Falcou, C. Tadonki, L. Lacassagne, and D. Etiemble,
Algorithmic Skeletons within an Embedded Domain Specific Language for the CELL Processor,
Parallel Architectures and Compilation Techniques (PACT), PACT09, Raleigh, North Carolina (USA), September 12-16, 2009.

C. Tadonki, G. Grosdidier, and O. Pene,
An efficient CELL library for Lattice Quantum Chromodynamics,
International Workshop on Highly Efficient Accelerators and Reconfigurable Technologies (HEART) in conjunction with the 24th ACM International Conference on Supercomputing (ICS), pp. 67-71, Epochal Tsukuba, Tsukuba, Japan, June 1-4, 2010. (ACM Computer Architecture News)

C. Tadonki, L. Lacassagne, T. Saidani, J. Falcou, K. Hamidouche,
The Harris algorithm revisited on the CELL processor,
International Workshop on Highly Efficient Accelerators and Reconfigurable Technologies (HEART) in conjunction with the 24th ACM International Conference on Supercomputing (ICS), pp. 97-100, Epochal Tsukuba, Tsukuba, Japan, June 1-4, 2010. (ACM Computer Architecture News)

C. Tadonki,
Ring pipelined algorithm for the algebraic path problem on the CELL Broadband Engine,
Workshop on Applications for Multi and Many Core Architectures (WAMMCA 2010) in conjunction with the International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2010), Petropolis, Rio de Janeiro, Brazil, October 27-30, 2010. (IEEE digital library)

C. Tadonki,
Large Scale Kronecker Product on Supercomputers,
2nd Workshop on Architecture and Multi-Core Applications (WAMCA 2011) in conjunction with the International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2011), Vitoria, Espirito Santo, Brazil, October 26-29, 2011. (IEEE digital library)

D. Barthou, G. Grosdidier, M. Kruse, O. Pene and C. Tadonki,
QIRAL: A High Level Language for Lattice QCD Code Generation,
Programming Language Approaches to Concurrency and Communication-cEntric Software (PLACES'12) in conjunction with the European joint Conference on Theory & Practice of Software (ETAPS), Tallinn, Estonia, March 24-April 1, 2012.

C. Tadonki,
Basic parallel and distributed computing curriculum,
Second NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar'12) in conjunction with the 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Shanghai, China, May 21-25, 2012.

C. Tadonki, L. Lacassagne, E. Dadi, M. Daoudi
Accelerator-based implementation of the Harris algorithm,
5th International Conference on Image and Signal Processing (ICISP 2012), Agadir, Morocco, June 28-30, 2012.

P.-L. Caruana and C. Tadonki
Seamless Parallelism in MATLAB,
Parallel Distributed Computing and Networks, Innsbruck, Austria, Feb 16-18, 2014.

F. Meyer, C. Tadonki, and F. Irigoin
Dendrogram Based Algorithm for Dominated Graph Flooding,
International Conference on Computational Science (ICCS 2014), Cairns, Australia, June 10-12, 2014.

A. Susungi, A. Cohen, and C. Tadonki,
More Data Locality for Static Control Programs on NUMA Architectures,
7th International Workshop on Polyhedral Compilation Techniques (IMPACT 2017), Stockholm, Sweden, January 23, 2017.

C. Tadonki,
Scalable NUMA-Aware Wilson-Dirac on Supercomputers,
International Conference on High Performance Computing & Simulation (HPCS 2017), Genoa, Italy, July 17-21, 2017.

A. Susungi, N. A. Rink, J. Castrillon, I. Huismann, A. Cohen, C. Tadonki, J. Stiller, J. Frohlich,
Towards Compositional and Generative Tensor Optimizations,
16th International Conference on Generative Programming: Concepts & Experience (GPCE 2017), Vancouver, Canada, October 23-24 2017.

N. A. Rink, A. Susungi, J. Castrillon, I. Huismann, A. Cohen, J. Stiller, and C. Tadonki,
CFDlang: High-level code generation for high-order methods in fluid dynamics,
International Workshop on Real World Domain Specific Languages 2018 (RWDSL 2018) in conjunction with the CGO'18 international symposium on Code Generation and Optimisation, DOI:10.1145/3183895.3183900, Vienna, Austria, February 24, 2018.

O. Haggui, C. Tadonki, F. Sayadi, B. Ouni,
Evaluation of an OpenMP Parallelization of Lucas-Kanade on a NUMA-Manycore,
9th Workshop on Architecture and Multi-Core Applications (WAMCA 2018) in conjunction with the 30th International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2018), Ecole Normale Supérieure de Lyon, Lyon, France, September 24-27, 2018.

A. Susungi, N. A. Rink, A. Cohen, J. Castrillon, C. Tadonki,
Meta-programming for Cross-Domain Tensor Optimizations,
17th International Conference on Generative Programming: Concepts & Experience (GPCE 2018), Boston, Massachusetts, USA, November 5-6, 2018.

O. Haggui, C. Tadonki, F. Sayadi, B. Ouni,
Efficient GPU Implementation of Lucas-Kanade through OpenACC,
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP 2019), Prague, Czech Republic, February 25-27, 2019.

O. Haggui, C. Tadonki, F. Sayadi, B. Ouni,
Memory Efficient Deployment of an Optical Flow Algorithm on GPU Using OpenMP,
20th International Conference on Image Analysis and Processing (ICIAP 2019), Trento, Italy, 9-13 September, 2019.

J.F.D. Souza, L.S.F. Machado, E. Gomi, C. Tadonki, S. McIntosh-Smith and H. Senger,
Performance of OpenMP loop transformations for the acoustic wave stencil on GPUs,
The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'22), Dallas, USA, November 13–18, 2022.

Leticia Suellen Farias Machado, Claude Tadonki, Hermes Senger,
A Source-to-Source NUMA Profiling Approach,
WAMCA2023 - 35th IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2023, Porto Alegre, Brazil, October 17-20, 2023.

Roblex Nana Tchakoute, Claude Tadonki,
Experimental Study of Power Consumption of Basic Parallel Programs,
WAMCA2024 - 36th IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2024, Hilo-Hawaii, USA, November 13-15, 2024.

Roblex Nana Tchakoute, Claude Tadonki, Petr Dokladal, Youssef Mesri
A Flexible Operational Framework for Energy Profiling of Programs,
WAMCA2024 - 36th IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2024, Hilo-Hawaii, USA, November 13-15, 2024.

Roblex Nana Tchakoute, Claude Tadonki, Petr Dokladal, Youssef Mesri
A Framework for Analytical Performance and Energy Prediction of DL Training on GPUs,
37th IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2025, Bonito/MS, BRAZIL, October 28-31, 2025.

Roblex Nana Tchakoute and Claude Tadonki
Energy-Aware Deep Learning on GPUs through Parameter Sharing and Mixed Precision Training,
LeanDL-HPC 2025: Workshop on Lightweight and Efficient Deep Learning in HPC Environments - 37th IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2025, Bonito/MS, BRAZIL, October 28-31, 2025.

Chahinèze Ztoti, Claude Tadonki, Roblex Nana Tchakoute, Hervé Chauris
Efficient SIMD and Shared-Memory Parallelization of 3D Acoustic Wave Propagation Simulation,
WAMCA2025: Workshop on Applications for Multi-Core Architectures - 37th IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2025, Bonito/MS, BRAZIL, October 28-31, 2025.

Andres Giraldo-Morales, Cristiana Bentes, Maria Clicia Castro, Gilson Costa, Claude Tadonki
ROPH: A Robust, Optimized, and Parallelized Harris Detector with Flexible FAST-Based Pruning,
WAMCA2025: Workshop on Applications for Multi-Core Architectures - 37th IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2025, Bonito/MS, BRAZIL, October 28-31, 2025.

Roblex Nana Tchakoute and C. Tadonki,
EAS-Sim: A Framework and its Methodology for the Co-Design of Multi-Objective, Energy-Aware Schedulers for AI Clusters,
Sustainable Supercomputing Workshop - The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'25), St. Louis/MO, USA, November 16-21, 2025.

Imane Ettifouri, Mostapha Zbakh, Claude Tadonki,
Experimental Evaluation and Analysis of Gradient Descent in Machine Learning,
The 7th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech'25), Rabat, MOROCCO, November 25-27, 2025.

L. Drouet, A. Dubois, A. Haurie and C. Tadonki,
A MARKAL-Lite Model for Sustainable Urban Transportation,
Optimization days, Montreal, Canada, May, 2003.

Claude Tadonki,
ProxAccpm: A convex optimization solver,
International Symposium on Mathematical Programming, ISMP2003, Copenhagen, Denmark, August 2003.

O. Briant, C. Lemaréchal, K. Monneris, N. Perrot, C. Tadonki, F. Vanderbeck, J.-P. Vial, C. Beltran, P. Meurdesoif,
Comparison of various approaches for column generation,
Eighth Aussois Workshop on Combinatorial Optimization, 5-9 January 2004.

Claude Tadonki and Jean-Philippe Vial,
Efficient algorithm for linear pattern separation,
International Conference on Computational Science, ICCS04 (LNCS/Springer), Krakow, Poland, June 2004.

Cesar Beltran, Claude Tadonki, Jean-Philippe Vial,
Semi-Lagrangian relaxation,
Computational Management Science Conference and Workshop on Computational Econometrics and Statistics, Link, Neuchatel, Switzerland, April 2004.

Claude Tadonki, Cesar Beltran and Jean-Philippe Vial,
Portfolio management with integrality constraints,
Computational Management Science Conference and Workshop on Computational Econometrics and Statistics, Link, Neuchatel, Switzerland, April 2004.

C. Beltran, C. Tadonki and J.-Ph. Vial,
The p-median problem solved by semi-Lagrangian relaxation,
First Mathematical Programming Society International Conference on Continuous Optimization (ICCOPT I), Troy, USA, August 2-4, 2004.

Claude Tadonki, Mitali Singh, Jose Rolim and Viktor K. Prasanna,
Combinatorial Techniques for Memory Power State Scheduling in Energy Constrained Systems,
Workshop on Approximation and Online Algorithms (WAOA), WAOA2003 (LNCS/Springer), Budapest, Hungary, September 2003.

Claude Tadonki and Jose Rolim,
An analytical model for energy minimization,
III Workshop on Efficient and Experimental Algorithms, WEA04 (LNCS/Springer), Angra dos Reis, Rio de Janeiro, Brazil, May 2004.

Claude Tadonki,
Universal Report: A Generic Reverse Engineering Tool,
12th IEEE International Workshop on Program Comprehension, IWPC 2004 (IEEE), University of Bari, Bari, Italy, June 2004.

Claude Tadonki and Jose Rolim,
An integer programming heuristic for the dual power management problem in wireless sensor networks,
2nd International Workshop on Managing Ubiquitous Communications and Services, MUCS2004, Dublin, Ireland, December 13, 2004.

Claude Tadonki,
Refinement experiments with RADDAM data,
EMBL bilateral meeting, Hamburg, Germany, June 26-28, 2006.

Claude Tadonki,
Off-line settings in wireless networks,
3rd International Symposium on Computational Intelligence and Intelligent Informatics, ISCIII2007, Agadir, Morocco, March 28-30, 2007.

E. Dadi, M. Daoudi, C. Tadonki
3D Shape Retrieval using Bag-of-feature method basing on local codebooks,
5th International Conference on Image and Signal Processing (ICISP 2012), Agadir, Morocco, June 28-30, 2012.

E. Dadi, M. Daoudi, C. Tadonki
Fast 3D shape retrieval method for classified databases,
International Conference on Complex Systems (ICCS'12), Agadir, Morocco, November 5-6, 2012.

A. Leite, C. Tadonki, C. Eisenbeis, T. Raiol, M.E. Walter, and A. de Melo
Excalibur: An Autonomic Cloud Architecture for Executing Parallel Applications,
Fourth International Workshop on Cloud Data and Platforms (CloudDP 2014), Amsterdam, Netherlands, April 13, 2014.

A. Leite, C. Tadonki, C. Eisenbeis, and A. de Melo
A Fine-grained Approach for Power Consumption Analysis and Prediction,
International Conference on Computational Science (ICCS 2014), Cairns, Australia, June 10-12, 2014.

A. F. Leite, V. Alves, G. N. Rodrigues, C. Tadonki, C. Eisenbeis, A. C. M. A. de Melo
Automating Resource Selection and Configuration in Inter-clouds through a Software Product Line Method,
8th IEEE International Conference on Cloud Computing, CLOUD 2015, New York City, NY, USA, June 27 - July 2, 2015.

Y. Samadi, M. Zbakh, C. Tadonki
Comparative study between Hadoop and Spark based on Hibench benchmarks,
2nd International Conference on Cloud Computing Technologies and Applications (CloudTech 2016), Marrakesh, Morocco, 24-26 May, 2016.

A. F. Leite, V. Alves, G. N. Rodrigues, C. Tadonki, C. Eisenbeis, A. C. M. A. de Melo
ADohko: An Autonomic System for Provision, Configuration, and Management of Inter-Cloud Environments based on a Software Product Line Engineering Method,
IEEE International Conference on Cloud and Autonomic Computing, CICCAC 2016, Augsburg, Germany, September 12-16, 2016.

Yassir Samadi, Mostapha Zbakh, and C. Tadonki,
E-HEFT: Enhancement Heterogeneous Earliest Finish Time algorithm for Task Scheduling based on Load Balancing in Cloud Computing,
International Conference on High Performance Computing & Simulation (HPCS 2018), Orleans, France, July 16-20, 2018.

Claude Tadonki, and Bernard Philippe, Parallel multiplication of a vector by a Kronecker product of matrices, IRISA report n° 1194, 1998.
Optimal Parallel Algorithm for the Kronecker Product in log(p) communication steps

Patrice Quinton, Claude Tadonki, and Maurice Tchuente, Un échéancier systolique et son utilisation dans l'ATM, IRISA report n° 1348, 2000.

Claude Tadonki, Synthèse d'ordonnancements parallèles par reproduction canonique, IRISA report n° 1349, also INRIA report n° 3996, 2000.

David Cachera, Sanjay Rajopadhye, Tanguy Risset, and Claude Tadonki, Parallelization of the algebraic path problem on linear simd/spmd arrays, IRISA report n° 1409, 2001.

Claude Tadonki and Jean-Philippe Vial, The linear separation problem revisited with accpm, Cahier de Recherche n° 2002.11, University of Geneva, June 2002.

F. Babonneau, C. Beltran, O. du Merle, C. Tadonki and J.-P. Vial, The proximal analytic center cutting plane method, Technical report, Logilab, HEC, University of Geneva, 2003.

Cesar Beltran, Claude Tadonki, and Jean-philippe Vial, Semi-Lagrangian relaxation, Technical report, Logilab, HEC, University of Geneva, 2004.

Ordonnancement pour l'informatique parallele, A. Moukrim and C. Picouleau (Eds), Hermes.

Patryk Kiepas, Corinne Ancourt, Claude Tadonki, Jaroslaw Kozlak, Using Performance Event Profiles to Deduce an Execution Model of MATLAB with Just-In-Time Compilation, DOI: 10.1007/978-3-030-72789-5_6 In book: Languages and Compilers for Parallel Computing , March 2021.

Yassir Samadi, Mostapha Zbakh, Claude Tadonki, Workflow Scheduling Issues and Techniques in Cloud Computing: A Systematic Literature Review, DOI: 10.1007/978-3-319-97719-5_16 In book: Cloud Computing and Big Data: Technologies, Applications and Security , January 2019.

Claude Tadonki, What Do HPC Applications Look Like?, DOI: https://doi.org/10.1007/978-3-031-29769-4_3 In book: High Performance Computing in Clouds : Moving HPC Applications to a Scalable and Cost-Effective Environment , Springer International Publishing, 17 March 2023.

Juan Antonio Lossio-Ventura, Eduardo Ceh-Varela, Genoveva Vargas-Solar, Ricardo Marcacini, Claude Tadonki, Hiram Calvo, Hugo Alatrista-Salas (Eds), Information Management and Big Data, DOI: https://doi.org/10.1007/978-3-031-63616-5 Book: Information Management and Big Data, Communications in Computer and Information Science, Springer, Cham, 28 June 2024.

Mostapha Zbakh, Mohammed Essaaidi, Claude Tadonki, Abdellah Touhafi, Dhabaleswar K. Panda (Eds), Artificial Intelligence and High Performance Computing in the Cloud, DOI: https://doi.org/10.1007/978-3-031-78698-3 Book: Artificial Intelligence and High Performance Computing in the Cloud, Lecture Notes in Networks and Systems, Springer, Cham, 2024.

Nana, R.T., Tadonki, C., Dokladal, P., Mesri, Y. (2024). Power Consumption in HPC-AI Systems. In: Zbakh, M., Essaaidi, M., Tadonki, C., Touhafi, A., Panda, D.K. (eds) Artificial Intelligence and High Performance Computing in the Cloud. CloudTech 2023. Lecture Notes in Networks and Systems, vol 1220. Springer, Cham. https://doi.org/10.1007/978-3-031-78698-3_6

Ettifouri, I., Zbakh, M., Tadonki, C. (2024). The Need for HPC in AI Solutions. In: Zbakh, M., Essaaidi, M., Tadonki, C., Touhafi, A., Panda, D.K. (eds) Artificial Intelligence and High Performance Computing in the Cloud. CloudTech 2023. Lecture Notes in Networks and Systems, vol 1220. Springer, Cham. https://doi.org/10.1007/978-3-031-78698-3_8

A. Dubois, A. Haurie, C. Tadonki, and D. Zachary, An Operational Energy Modelling System, 2003.

R. Ndoundam, C. Tadonki, and M. Tchuente, Parallel chip firing game associated with n-cube orientation, 2000.

F. Babonneau, C. Beltran, O. du Merle, C. Tadonki, and J.-P. Vial, The proximal analytic center cutting plane method, 2003.

C. Tadonki, Universal Report: A generic reverse engineering tool, 2003.

C. Tadonki, M. Singh, J. Rolim, and V. Prasanna, Combinatorial technic for memory power state scheduling in energy-constrained system, 2003.

D. Cachera, S. Rajopadhye, T. Risset, and C. Tadonki, Algorithmic tiling for efficient parallel APP implementation, 2003.

C. Tadonki and J.-P. Vial, Portfolio Selection with Cardinality and Bound Constraint, 2003.

C. Tadonki and J. Rolim, An analytical model for energy minimization, 2003.

C. Tadonki and J. Rolim, An integer programming heuristic for the dual power management problem in wireless sensor networks, 2004.

Portfolio Selection with Cardinality Constraints

Efficient Matrix Computations in Cutting Planes Algorithms

Algorithmic Technics of Designing Energy Efficient Algorithms

Structural Method for Lower Bounds in Complexity

Improved Graph Model for Optical Network of Sensors

Dynamic Behavior of Parallel Chip Firing Game on Regular Graphs

Chapter of a book on "Parallel Scheduling" published by Hermes

Book in "Scientific Computation"