The Steps to Build a Robust Clinical Application


Validation is a crucial step before a medical device (MD) is put on the market.

It is tempting to think that validating an artificial intelligence triage solution for radiology is an easy path: the output is a simple Yes/No for a single pathology, so surely it doesn’t require a lot of work!

However, it is a step that carries heavy responsibility, whether to reassure us as software developers or to obtain certifications such as FDA clearance or the CE mark. It is therefore an important regulatory requirement.

The goal is to prove to the medical community, and to the competent authorities, that our product is accurate, reliable, performant (it performs as intended), fast, and above all: that it carries no risk for the patient!

This proof takes the form of statistical results that are rigorously chosen, reliable, robust, and appropriate. We obtain them through validation studies, by testing our algorithms on a sample of real-world medical images from the clinic. This sample is meant to represent the targeted population. Of course, before we start the actual validation process, we need to know our MD inside out – to get the facts.

3 important preliminary steps must be carried out:

  • Carry out bibliographical research:

This is a very long and tedious part, which aims to define the state of the art relative to the device we want to validate. The points that will be addressed are:

Search and define similar/predicate devices:

a) Know their limitations/dysfunctions (to exclude them from our products). Or, if we observe the same limitations in our products during the validation phase, this will give us a basis for discussion.

b) Understand the statistical performance of competitors, to set a benchmark for the performance and effectiveness we wish to achieve with our device, or even surpass.

– Appreciate the incidence, prevalence, and the target and at-risk populations (age, sex). In short, know the epidemiology of the pathology targeted by our MD.

– This literature base will allow us to set up a solid validation protocol adapted to the MD to be validated, by choosing the right methodologies and adequate, appropriate statistical methods.

  • Write the validation protocol (the Study Summary), based on the research described above.

  • Constitute the database:

This step is also very important, and sometimes the longest. As we know, we will not be able to constitute a database representative of the entire targeted population. Nevertheless, given the bibliography we have built, we can target certain characteristics to get as close to it as possible.

Indeed, if we know that 90% of the cases of the pathology we are targeting occur in Asian men aged 70 and over, it’s clear that we are not going to look for data on Caucasian women under the age of 30. It seems obvious, but much less so when we are under pressure to finalize the validation.

Another important point is to collect clinical data acquired at a large number of clinical sites (multi-center data), to cover as many different populations and acquisition protocols as possible.

Also, we need to set the performance we want to achieve. This can be derived from the bibliographic research we have done on similar devices: we can already decide not to perform worse than the competition. Common sense will also push us to say that we do not want, for example, a sensitivity or specificity lower than 90%. Lastly, setting such targets is also an FDA requirement for triage MDs.

These requirements will allow us to calculate the minimum sample size of our database: the sample size calculation. This size is indicative only: it is a minimum value, and it will inevitably increase according to the criteria we wish to explore during the validation.
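As an illustration of how such a minimum can be computed, here is a sketch using Buderer’s method, which sizes the positive and negative cohorts so that sensitivity and specificity are each estimated within a chosen precision. The target values, precision, and prevalence below are hypothetical placeholders, not our actual protocol figures:

```python
import math

def sample_size(sens=0.90, spec=0.90, prevalence=0.30, precision=0.05, z=1.96):
    """Minimum sample size so that sensitivity and specificity are each
    estimated within +/- `precision` at ~95% confidence (Buderer's method)."""
    # Diseased cases needed to pin down sensitivity.
    n_pos = math.ceil(z**2 * sens * (1 - sens) / precision**2)
    # Non-diseased cases needed to pin down specificity.
    n_neg = math.ceil(z**2 * spec * (1 - spec) / precision**2)
    # Scale by the expected prevalence in the collected database.
    total = max(math.ceil(n_pos / prevalence), math.ceil(n_neg / (1 - prevalence)))
    return n_pos, n_neg, total

n_pos, n_neg, total = sample_size()  # -> 139 positives, 139 negatives, 464 total
```

With these illustrative inputs, the binding constraint is the positive cohort: at 30% prevalence, 464 exams are needed just to accumulate 139 positives, before any subgroup stratification inflates the count further.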

The performance is first calculated on the totality of the data, but also on stratified subgroups:

1) Scanner makes:

We will try to obtain data from all the major scanner manufacturers. For each manufacturer, we want as many models as possible, with all the available detector row counts, etc.

Therefore, the more subgroups we wish to explore (validate), the larger the sample size for each group must be to obtain statistically significant results.

Hence, the minimum sample size that we have calculated can increase quickly.

2) In addition, we will explore the acquisition protocols recommended in current clinical practice:

We will collect data that comes as close as possible to them, for example by aiming at the recommended slice thickness, radiation dose parameters (kVp, mAs, etc.), and more.

3) We NEED equivalent proportions in terms of sex (depending on the pathology and target population): ~50% for each group.

4) The same applies to age.

Always refer to the target population. Of course, we can have small samples in young populations (<20 years old) as long as we can justify that the targeted pathology does not affect this population.

5) …. And more parameters.

6) Also, this is where it gets more complex:

If you are aiming for FDA clearance, you should know that at least 50% of the data must be acquired in the US. This may not be an easy task, especially if you apply, IN ADDITION, all the selection criteria described above.

7) Finally, the location of the pathology must be taken into account.

For example, large vessel occlusion (LVO) in the anterior circulation affects the M1 MCA, proximal M2 MCA, distal M2 MCA, and ICA. So, the data with positive LVO must be distributed across these groups, for a stratification that holds up. BUT, as we have done our homework (i.e., the bibliography), we know that LVO affects the MCA more often than the ICA, so we can justify a smaller sample for the latter category.
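The stratified analysis described above (overall performance plus per-subgroup breakdowns) can be sketched as follows. This is an illustrative Python snippet; the stratum labels and the flat data layout are hypothetical, not our production pipeline:

```python
from collections import defaultdict

def stratified_performance(cases):
    """Compute sensitivity/specificity overall ("ALL") and per stratum.
    `cases` is a list of dicts: {"stratum": ..., "truth": bool, "pred": bool}."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for c in cases:
        # Each case contributes to the overall pool and to its own stratum.
        for key in ("ALL", c["stratum"]):
            k = counts[key]
            if c["truth"]:
                k["tp" if c["pred"] else "fn"] += 1
            else:
                k["fp" if c["pred"] else "tn"] += 1
    results = {}
    for key, k in counts.items():
        pos, neg = k["tp"] + k["fn"], k["tn"] + k["fp"]
        results[key] = {
            "sensitivity": k["tp"] / pos if pos else None,
            "specificity": k["tn"] / neg if neg else None,
            "n": pos + neg,
        }
    return results
```

A stratum could equally be a scanner make, a slice-thickness bucket, a sex or age band, or an occlusion site; the point is that every additional stratum needs enough cases of its own for its estimates to mean anything.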

So far, I have only described the preliminary parts.

Once all these steps are done, the actual validation starts.

The main difficulty of this part is finding radiologists specialized in the targeted type of pathology to carry out the validation, whether it is a quantitative or qualitative validation, or simply to establish the ground truth.

For the FDA, the operators must be US board-certified radiologists (for our indication of use). In addition, to establish the ground truth, a minimum of 3 physicians must be involved.
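One common way to turn the independent reads of those physicians into a single ground-truth label is a majority vote, as in this illustrative sketch. The actual adjudication scheme (consensus discussion, a senior tie-breaker, etc.) depends on the study protocol:

```python
def ground_truth_majority(reads):
    """Establish a ground-truth label from independent reads by an odd
    number of physicians, using a simple majority vote."""
    if len(reads) % 2 == 0:
        raise ValueError("Use an odd number of readers to avoid ties")
    return sum(reads) > len(reads) / 2

# Three board-certified radiologists read the same exam independently:
label = ground_truth_majority([True, True, False])  # -> True (positive case)
```

Requiring an odd number of readers guarantees that a simple vote can never deadlock, which is one practical reason a minimum of 3 physicians is used.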

The validation process can be long, as it depends on the availability of the physicians, and the huge amount of clinical data they will assess.

When the statistical results are obtained, we validate (or not) the performance of our medical device.

If the statistics are not good (lower than the limit we set beforehand); if processing errors appear during the validation; if bugs, display, or calculation problems are found; then, the validation stops, and the software goes back into development!
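One common way to make this pass/fail check rigorous (an illustrative sketch, not necessarily the exact statistical method used in any given submission) is to require that the confidence-interval lower bound of each metric, not just its point estimate, clears the pre-set limit:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """~95% Wilson score confidence interval for a proportion,
    e.g. sensitivity = TP / (TP + FN)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical example: 135 of 145 positive cases detected (~93.1% sensitivity).
lo, hi = wilson_interval(135, 145)
meets_target = lo > 0.90  # stricter than checking the point estimate alone
```

Note that in this hypothetical example the point estimate (~93.1%) exceeds the 90% goal, yet the Wilson lower bound (~87.8%) does not, which is precisely why the sample size must be large enough for the interval to be narrow.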

Who still says that validating a triage software is easy?

At Avicenna.AI we are fortunate to have an experienced team, having already carried out such validations, in a changing and increasingly demanding environment.

This has allowed us to quickly put in place the rigorous validation studies required to obtain FDA clearance for our LVO (large vessel occlusion) and ICH (intracranial hemorrhage) triage software.

This expertise also feeds into the construction and training of our algorithms, and has allowed us to obtain applications with high performance and effectiveness, robust for safe and accurate use, which is for us the key to the acceptance of AI algorithms in clinical routine.

Do you want to know how we validate our products? Visit our YouTube channel and watch Angela Ayobi, our Clinical Affairs Engineer, go through every step of the validation process.

Why triage for Avicenna.AI?

Peter Chang Interview

We all recognize that artificial intelligence plays an increasingly prominent role in the medical field today. Each day more and more software applications are being developed to support physicians and facilitate their work. So what makes Avicenna stand out from the rest? 

To begin, a key focus of the Avicenna portfolio today lies in a specific area of healthcare focused on emergency triage. This strategic vision is motivated largely by understanding the intersection of what AI technology is capable of doing today and what physicians are most likely to find useful. By carefully designing applications that enhance the physician workflow, we are acknowledging that, at least in the immediate future, AI will play a very focused role in augmenting patient care without replacing autonomous human decision making.

Having made this observation, one natural and powerful role AI can play is to assist human physicians in the triage of patient care. In this framework, AI software is used to identify patients with positive, urgent findings that need to be addressed promptly. All findings and diagnoses made by the AI system are ultimately reviewed by a human physician before final treatment decisions are made. The synergy afforded by this system between the human physician and AI will ultimately lead to improved patient care and outcomes.

Our solution does not stop there. In the triage marketplace, there remains significant room for improvement. One of the key features missing in all currently available commercial solutions is the ability to perform a simple comparison with a historic examination, that is, the ability to not just detect hemorrhage, but to characterize whether the hemorrhage has gotten bigger or smaller. In most academic practices, the vast majority of all exams are follow-up studies, and so the presence of a finding is usually not innately an emergency on its own; instead, it is the presence or absence of change that physicians care most about.

Based on this discussion, the natural next evolution of AI software is a type of algorithm whereby humans no longer need to participate in the interpretation process. This new type of application is certainly in my opinion the next most important, most exciting opportunity for AI. The idea here is to develop very sensitive algorithms, capable of detecting any anomaly present on the screen. Such a tool, if calibrated properly, may be used to screen all patients before human interpretation; depending on algorithm confidence, one may imagine that a subset of normal patients may not need to be evaluated at all by a human physician. Freeing the attention of busy physicians to now focus on those select few patients that need it the most, this system may realize what the economist Jean Fourastié once claimed as “the machine that leads man to specialize in the human”.

Dr. Peter Chang – Director at UCI Center for AI in Diagnostic Medicine

In the radiology market, Avicenna.AI positions itself as a challenger


In an increasingly competitive market, Avicenna.AI has chosen to take its time. That market is intelligent radiology: software that guides the physician’s gaze and helps detect pathologies that would have escaped the naked eye. “The market is occupied by players too big to develop tools like ours,” observes Olivier Fuseri, the company’s business development officer. “There are also more and more startups raising large funding rounds.” Faced with these players, Avicenna.AI has chosen to position itself as a challenger. “We stayed hidden for as long as possible,” taking the time to perfect its solution and reach high performance. Thus, while most intelligent imaging solutions point out every abnormal element to the radiologist (“which wastes their time”), the software developed by Avicenna.AI aims to alert them only in case of emergency, “with very few false positives.”

The idea is indeed to offer a time saving to radiologists, who are often understaffed, but also to improve patient care.

For now, the company has developed three products capable of detecting, from the patient’s CT scans, intracerebral hemorrhages and large vessel occlusions: pathologies for which every second counts. “Every second, neurons are lost and will never be regenerated.” Hence the importance of rapid care. “This allows a drastically reduced recovery time before returning to a normal life.” And better recovery means lower costs for society and the healthcare system. “That means fewer work stoppages, fewer complications to treat…”

Awaiting clearance from the US FDA

After the development phase comes the certification phase, which is even more time-consuming. Two products have already obtained European certification; they are now awaiting its American equivalent, clearance by the Food and Drug Administration (FDA). “The United States is our main target. When I talk to large international companies, they explain to me that they do not invest in French companies that are not capable of working with the FDA.” A Grail that will require a few more months of patience before the commercialization planned for the second half of this year.

And the market is huge. “All centers, including the smallest, have CT scanners” and are therefore potential customers. But to reach them, the company has chosen to go through distributors. “Our goal is to work through trained partners who know the products well.” These may be marketplaces or providers of radiology tools. “They mainly offer hardware and do not do image processing. Their customers are asking for it to be added to their tools.” Avicenna.AI is also targeting MRI and CT scanner manufacturers, who could likewise integrate the software into their products.

Thanks to these partners, the company hopes to see its products in ten to twenty healthcare centers by the end of the year. “It will be an incubation year to see how the market takes shape.” In the longer term, the ambition is to become “one of the leaders in emergency imaging, but also in other areas.” Oncology could be one of them. “That will depend on the research opportunities that arise.”

Read the full press article.