100,000 histological images of human colorectal cancer and healthy tissue

The NCT-CRC-HE-100K dataset is a set of 100,000 non-overlapping image patches extracted from 86 H&E-stained slides of human cancer tissue and normal tissue from the NCT biobank (National Center for Tumor Diseases) and the UMM pathology archive (University Medical Center Mannheim). The companion dataset Colorectal Cancer-Validation-Histology-7K (CRC-VAL-HE-7K) consists of 7,180 images extracted from 50 patients with colorectal adenocarcinoma; these patients do not overlap with the patients in NCT-CRC-HE-100K. The data were created by pathologists, who manually delineated tissue regions in whole slide images into the following nine tissue classes: adipose (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR), and colorectal adenocarcinoma epithelium (TUM).
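
The nine class codes above can be kept in a small lookup table when working with the patches. The following is a minimal sketch, not part of the dataset's own tooling: it assumes the images are stored in one sub-folder per class code (for example `ADI/`, `TUM/`), which is a layout assumption rather than something stated here, and the name `count_patches_per_class` is hypothetical.

```python
from collections import Counter
from pathlib import Path

# Class codes and names as listed in the dataset description.
TISSUE_CLASSES = {
    "ADI": "adipose",
    "BACK": "background",
    "DEB": "debris",
    "LYM": "lymphocytes",
    "MUC": "mucus",
    "MUS": "smooth muscle",
    "NORM": "normal colon mucosa",
    "STR": "cancer-associated stroma",
    "TUM": "colorectal adenocarcinoma epithelium",
}


def count_patches_per_class(root):
    """Count files in root/<CODE>/ for each class code.

    Assumes one sub-folder per class code (an assumption about the
    on-disk layout). Missing folders are reported with a count of 0.
    """
    counts = Counter()
    for code in TISSUE_CLASSES:
        class_dir = Path(root) / code
        if class_dir.is_dir():
            counts[code] = sum(1 for p in class_dir.iterdir() if p.is_file())
        else:
            counts[code] = 0
    return counts
```

Calling `count_patches_per_class("path/to/NCT-CRC-HE-100K")` on an extracted copy would give a per-class patch count, which is a quick way to check class balance before training.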