File Sizes & Distribution

This page reviews file size examples based on the number of cancer sites requested.

Data will be sent to investigators on thumb drives.

The data files will be encrypted using SecureDoc and password-protected. The data files will also be compressed using the GZIP compression utility. On a PC running the Windows operating system, programs such as 7-Zip and WinZip can unzip the compressed files into a directory that the user specifies. On a UNIX or Linux machine, GUNZIP is needed to unzip the files.
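As an alternative to the utilities named above, a GZIP file can also be decompressed from a script. The following is a minimal sketch in Python using only the standard library. It assumes the drive has already been unlocked and the password entered, since it handles GZIP decompression only, not SecureDoc encryption. The function name `gunzip_file` and any file or directory names passed to it are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dest_dir):
    """Decompress one GZIP file into dest_dir and return the output path."""
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Drop a trailing ".gz" to name the output; otherwise append ".out".
    name = src.stem if src.suffix.lower() == ".gz" else src.name + ".out"
    dest = dest_dir / name
    # Copy in chunks so a multi-gigabyte file never has to fit in memory.
    with gzip.open(src, "rb") as f_in, open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dest
```

Because the file is streamed rather than read in one piece, the same function should work on Windows, UNIX, and Linux for files much larger than available memory.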

The following table provides some examples from recent productions. The files are compressed before they are written to the thumb drives; however, the file sizes provided are for the files in their uncompressed format. The table shows the estimated size of files in gigabytes for one major cancer site (colorectal) and for four major cancer sites combined (breast, colorectal, lung, and prostate). Also shown are estimates for the non-SEER cohort database, which includes patients who are not in the SEER file, answered "NO" to any cancer question, and resided in a SEER area at the time of the survey.

File    1 Major Cancer Site Size (GB)    4 Major Cancer Sites Size (GB)    5% Non-SEER Database Size (GB)
SEER 0.50 0.18
MBSF AB/ABCD 0.27 1.45 8.32
CAHPS Survey 0.03 0.72 4.03
MEDPAR 99-19 0.50 2.16 10.01
HHA 99-19 0.40 1.66 9.07
Hospice 99-19 0.30 1.30 5.15
DME 99-19 0.74 3.47 17.16
NCH
NCH 1999 0.30 1.41 7.16
NCH 2000 0.36 1.71 8.14
NCH 2001 0.45 2.16 9.78
NCH 2002 0.56 2.69 11.74
NCH 2003 0.65 3.09 13.38
NCH 2004 0.72 3.46 14.84
NCH 2005 0.78 3.83 16.28
NCH 2006 0.80 3.99 16.62
NCH 2007 0.81 4.07 17.02
NCH 2008 0.82 4.20 17.56
NCH 2009 0.83 4.35 18.17
NCH 2010 0.85 4.46 18.71
NCH 2011 0.83 4.45 18.89
NCH 2012 0.82 4.49 19.38
NCH 2013 0.80 4.48 19.67
NCH 2014 0.77 4.43 19.71
NCH 2015 0.75 4.34 19.70
NCH 2016 0.70 4.12 18.93
NCH 2017 0.62 3.79 18.00
NCH 2018 0.52 3.38 17.58
NCH 2019 0.45 3.04 16.86
Outpatient
Outpatient 1999 0.11 0.52 2.73
Outpatient 2000 0.14 0.66 3.22
Outpatient 2001 0.17 0.86 3.85
Outpatient 2002 0.22 1.09 4.75
Outpatient 2003 0.25 1.27 5.49
Outpatient 2004 0.29 1.45 6.19
Outpatient 2005 0.31 1.59 6.74
Outpatient 2006 0.32 1.62 6.81
Outpatient 2007 0.33 1.68 7.06
Outpatient 2008 0.34 1.72 7.35
Outpatient 2009 0.35 1.79 7.59
Outpatient 2010 0.36 1.86 7.85
Outpatient 2011 0.37 1.93 8.21
Outpatient 2012 0.37 1.97 8.38
Outpatient 2013 0.38 2.01 8.48
Outpatient 2014 0.36 2.07 8.52
Outpatient 2015 0.37 2.02 8.64
Outpatient 2016 0.36 2.02 8.75
Outpatient 2017 0.33 1.90 8.32
Outpatient 2018 0.28 1.69 8.25
Outpatient 2019 0.24 1.51 8.03
Part D Event
Part D Event 2007 0.28 1.39 7.63
Part D Event 2008 0.30 1.53 8.48
Part D Event 2009 0.31 1.59 8.95
Part D Event 2010 0.32 1.66 9.52
Part D Event 2011 0.32 1.68 9.81
Part D Event 2012 0.33 1.80 10.74
Part D Event 2013 0.34 1.89 11.49
Part D Event 2014 0.33 1.83 11.51
Part D Event 2015 0.31 1.77 11.44
Part D Event 2016 0.29 1.70 11.35
Part D Event 2017 0.27 1.58 11.02
Part D Event 2018 0.24 1.45 10.59
Part D Event 2019 0.21 1.32 9.98
CC Summary 99-19 0.20 1.07 5.92
Other CC 00-19 0.25 1.34 7.49
MDS 99-18 0.26 1.07 6.17
OASIS 99-18 0.39 1.62 8.13
Part D MTM 13-17 0.00 0.01 3.14
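Because the sizes listed are for uncompressed files, they can be added up to estimate how much disk space a request will need once unzipped. The following Python sketch copies four rows from the table above and sums one column. The names `SIZES_GB`, `COLUMNS`, and `estimate_total_gb` are illustrative assumptions, and the result is only a rough estimate, not an exact size for any particular request.

```python
# Uncompressed sizes in GB, copied from four rows of the table:
# (1 major cancer site, 4 major cancer sites, 5% non-SEER database)
SIZES_GB = {
    "MEDPAR 99-19": (0.50, 2.16, 10.01),
    "HHA 99-19": (0.40, 1.66, 9.07),
    "Hospice 99-19": (0.30, 1.30, 5.15),
    "DME 99-19": (0.74, 3.47, 17.16),
}

# Position of each table column inside the tuples above.
COLUMNS = {"1_site": 0, "4_sites": 1, "non_seer": 2}


def estimate_total_gb(files, column):
    """Sum the uncompressed sizes (GB) of the named files for one column."""
    idx = COLUMNS[column]
    return round(sum(SIZES_GB[name][idx] for name in files), 2)


if __name__ == "__main__":
    wanted = ["MEDPAR 99-19", "HHA 99-19", "Hospice 99-19"]
    print(estimate_total_gb(wanted, "4_sites"))
```

Adding further rows from the table to `SIZES_GB` extends the estimate to other files, such as individual NCH or Outpatient years.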
Last Updated: 07 Jul, 2021