“Artificial Intelligence”, or simply “AI”, is a broad term for algorithms that aim to simulate human intelligence by learning and making decisions. Generally, algorithms (“software programmes”) consist of a set of instructions to perform a specific task. However, some tasks that are easy for humans are very difficult to write down as instructions for a computer; a classic example is deciding whether an image shows a cat or a dog. This is where AI algorithms can be useful. Rather than following step-by-step instructions written for the task, these algorithms are “trained” by being given an input (such as a picture of a cat) and being told the correct output (“this is a picture of a cat”). For each training image, the algorithm’s own output is compared with the correct output and small adjustments are made to its internal calculations to minimise the difference between the two. With a large amount of good-quality training data, these algorithms learn to complete a task well without ever being explicitly told how to do so.
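The training process described above can be sketched in a few lines of Python. This is a toy illustration only, not the method used by any algorithm in this study: it assumes each image has already been reduced to a single made-up number (`feature`), and the names `train`, `predict`, `learning_rate` and the example data are all hypothetical.

```python
import math

def predict(model, feature):
    """Return the model's output: a probability between 0 and 1 of 'cat'."""
    weight, bias = model
    return 1.0 / (1.0 + math.exp(-(weight * feature + bias)))

def train(examples, epochs=1000, learning_rate=0.5):
    """Fit a one-feature model by making many small corrections."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for feature, correct_output in examples:
            # Compare the model's own output with the correct output.
            error = predict((weight, bias), feature) - correct_output
            # Adjust the internal numbers slightly to shrink the difference.
            weight -= learning_rate * error * feature
            bias -= learning_rate * error
    return weight, bias

# Made-up training data: (feature, correct output), 1 = "cat", 0 = "dog".
examples = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
model = train(examples)

print(round(predict(model, 0.9), 2))  # close to 1: "probably a cat"
print(round(predict(model, 0.1), 2))  # close to 0: "probably a dog"
```

Nowhere in the code is there a rule saying what a cat looks like; the model arrives at a working rule purely from the examples and their correct answers.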
In radiology, AI algorithms are trained in this manner to detect disease using thousands of medical images which a radiologist has marked as containing cancer or not. A new set of images, which weren’t used for training, can then be presented to test the algorithm’s ability to detect cancer. Our research will systematically test how well AI algorithms, which have been developed by different groups and companies, perform in the context of breast cancer screening.
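As a rough sketch of what testing on unseen images involves, the Python below compares an algorithm's decisions with the known outcomes for a small made-up test set. Sensitivity and specificity are standard measures for this kind of comparison, but the source does not state which measures this study will report, so treat the function name, the measures chosen and the data as assumptions.

```python
def sensitivity_specificity(has_cancer, flagged):
    """Compare known outcomes with an algorithm's decisions.

    has_cancer[i] is True if test case i is known to contain cancer.
    flagged[i] is True if the algorithm flagged test case i.
    """
    if len(has_cancer) != len(flagged):
        raise ValueError("both lists must describe the same test cases")
    pairs = list(zip(has_cancer, flagged))
    cancers = sum(1 for truth, _ in pairs if truth)
    normals = len(pairs) - cancers
    if cancers == 0 or normals == 0:
        raise ValueError("test set needs both cancer and non-cancer cases")
    detected = sum(1 for truth, flag in pairs if truth and flag)
    cleared = sum(1 for truth, flag in pairs if not truth and not flag)
    # Sensitivity: share of cancers the algorithm found.
    # Specificity: share of normal cases it correctly left unflagged.
    return detected / cancers, cleared / normals

# Made-up test set of ten cases that were never used for training.
has_cancer = [True, True, True, True, False, False, False, False, False, False]
flagged = [True, True, True, False, False, False, False, False, False, True]

sensitivity, specificity = sensitivity_specificity(has_cancer, flagged)
print(sensitivity)  # 0.75: three of the four cancers were detected
```

Keeping the test images separate from the training images matters because an algorithm can score well on images it has already seen without being able to handle new ones.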
Breast screening is conducted as part of a national programme in the UK. Women aged between 50 and 70 are invited for a mammogram (an x-ray of the breast) every three years, and roughly 2.2 million women are screened each year. At present, the screening mammograms are read by two expert readers (radiologists and radiographers), and around 70-80% of the cancers are detected. The other 20-30% are detected either at the next screening mammogram three years later or beforehand, if symptoms such as a lump appear. Screening and early detection of breast cancer are known to improve outcomes for the women affected; however, screening is a labour-intensive process. It is believed that AI algorithms have the potential to work alongside expert readers to ease the workload in the breast screening programme and make it more effective by:
- Improving the accuracy of reading mammograms and cancer detection
- Acting as one of the two readers to reduce the manpower requirements
- Making reading more efficient by marking and prioritising mammograms that the algorithm judges more likely to contain cancer
- Identifying women with dense breast tissue, for whom mammography is less sensitive, to benefit from supplementary imaging methods such as ultrasound or MRI.
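The prioritisation idea in the third item of the list above can be illustrated with a short Python sketch. It assumes, hypothetically, that an algorithm gives each mammogram a score between 0 and 1, where a higher score means cancer is judged more likely; the field names `case_id` and `ai_score` and the scores themselves are invented for the example.

```python
def prioritise(worklist):
    """Return the reading list with the highest-scoring cases first."""
    return sorted(worklist, key=lambda case: case["ai_score"], reverse=True)

# Made-up reading list; scores are illustrative only.
worklist = [
    {"case_id": "A", "ai_score": 0.05},
    {"case_id": "B", "ai_score": 0.91},
    {"case_id": "C", "ai_score": 0.40},
]

for case in prioritise(worklist):
    print(case["case_id"], case["ai_score"])
```

In this sketch every mammogram is still read; only the order changes, so that the cases of greatest concern reach a reader sooner.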
We are performing this work to improve breast cancer screening and, as part of this research, we plan to build a database containing mammograms from approximately 80,000 women to be representative of the UK population. The mammograms will be obtained from those performed as part of the Cambridge Breast Screening Programme and the database will form an essential resource for testing AI algorithms as well as being valuable to the creation and development of new AI algorithms. This is because both training and testing benefit from large numbers of varied, high quality images. We will initially collect the existing mammograms of women who attended screening between 2011 and 2020 and, during the course of our research, we will continue to collect new screening mammograms until 2022.
Firstly, a key database of non-anonymised data will be created and stored at the hospital (Addenbrooke’s) within the NHS firewall. The key database will contain information to help us identify each case that will be included in the final testing database. The identifiable information that is included is normally collected as part of routine care (date of birth, NHS number, hospital number and screening number). A new unique trial ID will also be created and added to identify each case. The management of this database will follow the security and data management rules for confidential information within the health sector. Only specific members of the research team who have been granted permission will be able to access this database. This database will not be used for testing and will remain at the hospital site.
Secondly, a final testing database containing only pseudonymised information will be created and stored at the university (Cambridge). Pseudonymised information includes the mammogram and the information relating to the image that is routinely documented as part of the screening programme: details of previous breast biopsies, treatment (e.g. surgery or radiotherapy) and genetic risk factors, alongside the unique trial ID. Data will be pseudonymised according to national standards such that all the information that could be used to identify someone (name, address, date of birth, NHS number, etc.) will be removed. This pseudonymised database will be used for testing.
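The split between the two databases can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the study's actual procedure or the national standard it refers to: the field names, the `pseudonymise` function, the use of a random token as the trial ID and the example record are all hypothetical.

```python
import secrets

# Hypothetical field names for the identifying details listed in the text.
IDENTIFYING_FIELDS = (
    "name", "address", "date_of_birth",
    "nhs_number", "hospital_number", "screening_number",
)

def pseudonymise(record, key_database):
    """Split one record into a key-database entry and a pseudonymised entry.

    Identifying details go into key_database (kept at the hospital);
    the returned dictionary holds everything else plus the trial ID.
    """
    # A random trial ID reveals nothing about the woman it stands for.
    trial_id = secrets.token_hex(8)
    while trial_id in key_database:
        trial_id = secrets.token_hex(8)
    key_database[trial_id] = {
        field: record[field] for field in IDENTIFYING_FIELDS if field in record
    }
    pseudonymised = {
        field: value for field, value in record.items()
        if field not in IDENTIFYING_FIELDS
    }
    pseudonymised["trial_id"] = trial_id
    return pseudonymised

# Entirely made-up record for illustration.
record = {
    "name": "Jane Example",
    "date_of_birth": "1960-01-01",
    "nhs_number": "000 000 0000",
    "mammogram_file": "image_0001.dcm",
    "previous_biopsy": False,
}
key_database = {}
testing_entry = pseudonymise(record, key_database)
print(sorted(testing_entry))  # ['mammogram_file', 'previous_biopsy', 'trial_id']
```

The trial ID is the only link between the two databases, so a case in the testing database can be traced back to a person only by someone with access to the key database at the hospital.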
Many AI algorithms have been developed by universities and computer software experts following recent advances in machine learning, computer processing power, and the availability of data. Using our proposed final testing database of mammograms, the most advanced of these AI algorithms will be tested on data that is representative of the UK breast screening population. The testing will be split by application (e.g. cancer detection), and this will provide an analysis of the AI algorithms’ performance against the UK screening standards to determine which AI algorithms are appropriate for further real-time testing in the NHS system. Our work will also aim to highlight the performance and limitations of each AI algorithm, which is vital if clinicians are to understand how to use them effectively in future real-world practice.
To be able to use AI algorithms within the NHS, we require regular monitoring and data security infrastructure. Partnerships of this kind already exist between the NHS and many commercial companies for other software solutions, such as the electronic health record systems used to store patient notes. In these partnerships, commercial companies and the NHS work collaboratively to regularly monitor and update systems, ensuring both patient safety and patient data safety.
The pseudonymised final testing database for research will be stored at the University of Cambridge. All our studies testing the AI algorithms and the analysis of the results will take place there as well. The database will be governed by a Database Access Committee, which will oversee all uses of this database.
In certain circumstances, limited amounts of pseudonymised data will be released to academics and commercial companies, but with strict restrictions on how the data can be used. This could occur for purposes such as adapting an algorithm to function with the type of mammogram that will be used in testing. This will allow a fairer comparison between algorithms despite differences in the mammographic images they were trained on, which can differ between regions and countries.
The national ethics boards, the authorities which approved this study, have concluded that consent from individual women is not required for their mammograms to be pseudonymised and included in the creation of an image database governed by strict information governance policies. Information posters giving the research team’s contact details will be displayed in the screening centres and on the screening vans that are taking part in this research. Women who do not want their data used for this research can contact the team to discuss opting out.
This research is part of a Cancer Research UK programme grant for ‘Risk adaptive breast cancer screening: a tailored imaging approach’.
- Full-Field Digital Mammography (FFDM)
- Artificial intelligence. More detail about the development of such algorithms can be found at EPSRC Centre for Mathematical Imaging in Healthcare (CMIH) https://www.cmih.maths.cam.ac.uk/projects/multi-task-transfer-learning-deep-convolutional-neural-network-automatic-classification-mammography/
- Doctor Nicholas Payne – Medical Physicist
- Richard Black – NHS
- Doctor Sarah Hickman – Clinical PhD student
Phone: 01223 348937
- Le EPV, Wang Y, Huang Y, Hickman S, Gilbert FJ. Artificial intelligence in breast imaging. Clin Radiol. 2019;74(5):357–66. (https://www.clinicalradiologyonline.net/article/S0009-9260(19)30116-3/fulltext)
- Gilbert FJ, Smye SW, Schönlieb CB. Artificial intelligence in clinical imaging: a health system approach. Clin Radiol. 2020;75(1):3–6. (https://clinicalradiologyonline.net/retrieve/pii/S0009926019305422)