This article was submitted to Paleontology, a section of the journal Frontiers in Earth Science
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Recently, deep learning has made significant advances in various image-related tasks, particularly in the medical sciences. Deep neural networks have been used to facilitate the diagnosis of medical images generated by various observation techniques, including CT (computed tomography) scans. As a non-destructive 3D imaging technique, CT scanning has also been widely used in paleontological research, where it provides a solid foundation for taxon identification, comparative anatomy, functional morphology, etc. However, the labeling and segmentation of CT images are often laborious, prone to error, and subject to researchers' own judgements. It is essential to set a benchmark for CT image processing of fossils and to reduce the time cost of manual processing. Since fossils from the same localities usually share similar sedimentary environments, we constructed a dataset comprising CT slices of protoceratopsian dinosaurs from the Gobi Desert, Mongolia. Here we tested the fossil segmentation performance of U-net, a classic deep neural network for image segmentation, and constructed a modified DeepLab v3+ network, which includes MobileNet v1 as feature extractor and uses atrous convolution to capture features at various scales. The results show that deep neural networks can efficiently segment protoceratopsian dinosaur fossils, which can save significant time compared with current manual segmentation. However, performance in a further test on a dataset of other vertebrate fossils, even from similar localities, is largely limited.
Vertebrate paleontology is based on fossils, the remains of ancient organisms. Because fossils usually do not preserve molecular or behavioral information, paleontologists mainly focus on their morphology, which includes not only exterior features but also interior structures such as brain endocasts and inner ears. Traditionally, researchers used thin sectioning to reveal interior structures, which destroys the fossils. With the application of non-destructive 3D imaging techniques, such as CT (computed tomography) scanning and synchrotron radiation scanning, paleontologists can observe and interact with previously hidden structures without damaging the specimens. CT scanning and other non-destructive imaging techniques have greatly facilitated the development of vertebrate paleontology, not only by revealing hidden structures but also by providing 3D models for teaching and exhibition.
Recently, CT scans have been successfully applied to various branches in vertebrate paleontology (
Application of CT scans in vertebrate paleontology.
Current CT data processing is laborious, especially for morphologically complex objects like human bodies and vertebrate fossils. CT scanners differentiate volumes by their absorptions of X-ray radiation, which is primarily controlled by their densities. However, fossils and their surrounding matrices are often similar or even identical in densities (
Comparison between different segmentation methods in
Most paleontological CT data are processed in commercial software such as VGstudio by Volume Graphics and Mimics by Materialise. There is also a handful of open-source software (e.g., ImageJ, developed by the National Institutes of Health, United States), but most of it provides less functionality than the commercial packages. Many 3D image processing packages offer thresholding and region growing functions for more efficient segmentation. Thresholding differentiates voxels according to their grey values, so it is ineffective when the ROI (region of interest) and surrounding areas are similar in density. Region growing starts from a seed voxel and grows in given directions within a grey value range, which means that all connected voxels within that range will be labeled. Thresholding and region growing work best when there is significant contrast between the ROI and surrounding areas and their boundaries can be clearly defined. Although current software can process multiple slices at the same time based on linear interpolation, this is often insufficient for rapid and accurate segmentation, especially of intricately preserved fossils. In CT data processing, segmentation is easy if there are significant grey value differences (which represent densities) between target elements (bony fossils, stained soft tissue, and cavity structures) and undesired elements (i.e., rock matrices or surrounding tissues). But due to complex taphonomy, fragmentary preservation, and the similarity between rocks and embedded fossils, CT data processing is usually exhausting for the following reasons:
1. Manual processing is necessary when the ROI and surrounding areas are similar in density or their boundaries are ambiguous.
2. Only linear interpolation (i.e., thresholding and region growing) is applied to track changes in density and morphology.
3. A consistent standard and sufficient validation for annotation are lacking.
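The thresholding and region growing operations described above can be illustrated with a minimal sketch. It is a toy 2D example in plain Python and does not reproduce the behavior of any particular software package; the function names, the 4-connected neighborhood, the grey value range, and the `toy_slice` data are all assumptions made for illustration.

```python
from collections import deque


def threshold(image, low, high):
    """Label every pixel whose grey value lies in [low, high]."""
    return [[low <= value <= high for value in row] for row in image]


def region_grow(image, seed, low, high):
    """Label pixels 4-connected to `seed` whose grey values lie in [low, high]."""
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    seed_row, seed_col = seed
    if not (low <= image[seed_row][seed_col] <= high):
        return mask  # the seed itself is outside the grey value range
    mask[seed_row][seed_col] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and low <= image[nr][nc] <= high:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


# Hypothetical slice: bright "bone" (200) in the top-left corner, darker
# "matrix" (50), and one isolated bright speck in the bottom-right corner.
toy_slice = [
    [200, 200, 50, 50],
    [200, 50, 50, 50],
    [50, 50, 50, 200],
]

thresholded = threshold(toy_slice, 150, 255)
grown = region_grow(toy_slice, (0, 0), 150, 255)
```

In this toy slice, thresholding labels the isolated bright speck along with the bone, whereas region growing from the seed at `(0, 0)` labels only the connected bone pixels. Neither approach helps once bone and matrix share the same grey values, which is the situation that forces manual processing.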
Over the last decade, deep learning has shown remarkable potential when applied to complicated tasks, including image processing. Since 2013, automated processing (classification, object detection, and segmentation) of medical images has made use of deep learning (
1. To reduce harm to patients, medical scans are usually restricted to relatively low energy and short exposure time. Therefore, they cannot produce as much contrast as the higher-energy beams used for fossils. Also, paleontological scan data usually have higher resolutions and thus require greater computational capability.
2. Most fossils are deformed and fragmentary, whereas many image-related tasks in the medical sciences focus on morphologically similar structures, for example hearts, livers, and lungs.
3. Paleontological scans generally have much smaller sample sizes than medical scans because of the uniqueness of fossil specimens.
In this study, we generate a dataset from the CT scans of three well-preserved protoceratopsian skulls (Ornithischia, Dinosauria) from the Gobi Desert, Mongolia, and manually annotate bone structures in each slice for subsequent analysis. We test the performance of classic U-net (
Deep neural network structures used in this study.
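To illustrate the idea behind atrous (dilated) convolution, here is a minimal 1D sketch in plain Python. It is not the authors' network: DeepLab applies 2D atrous convolutions with learned kernels, while the signal, the fixed edge kernel, and the rates 1, 2, and 3 below are assumptions chosen only to show how the same kernel covers a wider span as the rate grows.

```python
def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1D cross-correlation with kernel taps spaced `rate` samples apart."""
    # Span covered by the dilated kernel (its receptive field).
    span = (len(kernel) - 1) * rate + 1
    return [
        sum(weight * signal[start + i * rate] for i, weight in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


# Hypothetical intensity profile across a slice: a bright band in a dark background.
signal = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
edge_kernel = [-1, 0, 1]  # illustrative fixed kernel; real networks learn their weights

# The same three-tap kernel applied at several rates sees spans of 3, 5, and 7 samples.
multi_scale = {rate: atrous_conv1d(signal, edge_kernel, rate) for rate in (1, 2, 3)}
```

The number of weights stays at three for every rate, which is why atrous convolution can enlarge the context seen by each output without adding parameters.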
Three protoceratopsian skulls were CT scanned at the Department of Earth & Planetary Sciences, Yale University, United States. Two of them (IGM 100/3654 and IGM 100/3655) were collected during the 1992 AMNH-Mongolian Academy of Sciences expeditions at Tugrugeen Shireh, Mongolia, and the third (IGM 100/1021) was collected at Ukhaa Tolgod, Mongolia. Specimens IGM 100/3654 and IGM 100/3655 are two nearly complete and undeformed skulls identified as embryonic on the basis of their size (less than 4 cm in lateral length, the smallest among all discovered protoceratopsian dinosaurs), ossification status, and morphological differences from later developmental stages. They are smaller and less ossified than IGM 100/1021 (a known embryo), which is a dorsoventrally deformed skull that preserves most facial elements.
Scanning details of the three protoceratopsian specimens are shown in
Details of raw dataset.
Specimen number | Taxa | Dimension | Voxel size (μm) | Selected slices (training + testing) |
---|---|---|---|---|
IGM 100/3654 | | 1228*1902*1042 | 21.43 | 2059 + 885 |
IGM 100/3655 | | 1362*1731*1193 | 21.44 | 3047 + 1239 |
IGM 100/1021 | Protoceratopsia | 768*1784*1533 | 22.74 | 2880 + 1205 |
In this study, we first test the performance of classic U-net for image segmentation (
Performance of different models.
Models | Feature extractor | Skip connection | mean_dice | mean_iou |
---|---|---|---|---|
Deeplab v3+ | MobileNet_v1 | 1 | 0.738 | 0.612 |
Deeplab v3+ | MobileNet_v1 | 2 | 0.894 | 0.817 |
Deeplab v3+ (ASPP concatenated) | MobileNet_v1 | 2 | 0.864 | 0.777 |
Deeplab v3+ | ResNet_v2_50 | 2 | 0.864 | 0.773 |
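For readers unfamiliar with the two metrics in the table, the following sketch shows the standard definitions of the Dice coefficient and intersection over union (IoU) for binary masks. It is not the authors' evaluation code. The function names, the flat-list mask representation, the empty-mask convention, and the per-slice averaging in `mean_scores` are assumptions; the text here does not state how mean_dice and mean_iou were averaged.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equally sized flat binary masks (1 = bone, 0 = matrix)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


def mean_scores(pairs):
    """Average Dice and IoU over (prediction, ground truth) pairs, one pair per slice."""
    scores = [dice_and_iou(pred, truth) for pred, truth in pairs]
    count = len(scores)
    return (
        sum(dice for dice, _ in scores) / count,
        sum(iou for _, iou in scores) / count,
    )


# Hypothetical six-pixel masks: two pixels agree, one false positive, one false negative.
example_pred = [1, 1, 1, 0, 0, 0]
example_truth = [1, 1, 0, 1, 0, 0]
example_dice, example_iou = dice_and_iou(example_pred, example_truth)
```

For a single pair of masks, Dice is never lower than IoU, which is consistent with every row of the table reporting a higher mean_dice than mean_iou.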
Part of the segmentation results are shown in
Segmentation results of protoceratopsian dinosaurs by DeepLab v3+ with MobileNet v1 as feature extractor. In each subfigure, from left to right: original CT slice, ground truth, and model prediction.
Rapid advances in imaging techniques not only enable unprecedented resolution in the observation of fossil material but also increase the cost of data processing. Paleontologists currently spend days to weeks segmenting fossil scans; the introduction of deep learning can reduce that time to minutes. Although the 3D renderings from automatically segmented slices are not as meticulous as manual results (
Comparison of different 3D renderings. From left: raw reconstruction (after thresholding), manual segmentation, and deep learning segmentation (by Deeplab v3+MobileNet v1)
To further test the generalization performance of the deep neural network, we use the best-performing trained Deeplab v3+ network to segment other vertebrate fossils from the Gobi Desert, Mongolia, with similar sedimentary environments, which should minimize sampling bias. The new dataset (see supplementary material) including CT-scanned fossil slices of ornithischian dinosaur
Segmentation results of non-protoceratopsian dinosaurs by DeepLab v3+ with MobileNet v1 as feature extractor.
Deep learning has already shown its power in many research branches such as Go game playing, medical imaging processing, and earth science studies (
The datasets presented in this study can be found in online repositories at
CY designed the project and prepared the datasets; CY, FQ, YL and ZQ performed the analysis. All authors contributed to preparing the manuscript.
CY and MN are funded by the Newt and Calista Gingrich Endowment.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
We thank the 1992 Mongolia-American Museum of Natural History Expedition crews for collecting those