POSTED BY: Stelios Tsampas / 11.01.2016

GDCM buffer overflow in ImageRegionReader::ReadIntoBuffer

CENSUS ID: CENSUS-2016-0001
CVE ID: CVE-2015-8396
Affected Products: Applications using GDCM versions < 2.6.2 and the ImageRegionReader::ReadIntoBuffer API call
Class: Integer Overflow or Wraparound (CWE-190)
Discovered by: Stelios Tsampas

Grassroots DICOM (GDCM) is a C++ library for processing DICOM medical images. It provides routines to view and manipulate a wide range of image formats and can be accessed through many popular programming languages like Python, C#, Java and PHP. Various applications that make use of GDCM are listed here and here.

GDCM versions 2.6.0 and 2.6.1 (and possibly previous versions) are prone to an integer overflow vulnerability which leads to a buffer overflow and potentially to remote code execution. The vulnerability is triggered by the exposed function gdcm::ImageRegionReader::ReadIntoBuffer, which copies DICOM image data to a buffer. ReadIntoBuffer checks whether the supplied buffer is long enough to accommodate the necessary data. However, this check fails to detect an integer overflow in the computed length, which leads to a buffer overflow later on in the code. The buffer overflow will occur regardless of the size of the buffer supplied to the ReadIntoBuffer call.

Details

The integer overflow takes place when the application attempts to calculate the DICOM image size. In file gdcmBoxRegion.cxx, line 85 we have:


size_t BoxRegion::Area() const
{
  return (Internals->YMax - Internals->YMin + 1)*
         (Internals->XMax - Internals->XMin + 1)*
         (Internals->ZMax - Internals->ZMin + 1);
}

The above variables represent the dimensions of the DICOM image region that is to be copied, and are set via a call to gdcm::ImageRegionReader::SetRegion(). In the most common case the region covers the entire image and is therefore controlled by the input file's DICOM headers, where the image dimensions are specified. Specially crafted dimensions can cause the multiplication to wrap around zero, thus making the return value smaller than the real size requirements.
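
To make the wraparound concrete, here is a minimal sketch that mirrors the shape of the Area() expression. It assumes that the region bounds are stored as 32-bit unsigned integers, which the listing above does not show. The names Bounds, area_32bit and area_64bit are hypothetical and are not part of GDCM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the region bounds. The 32-bit member type is an
// assumption made for illustration, not something taken from the GDCM sources.
struct Bounds {
    std::uint32_t xmin, xmax;
    std::uint32_t ymin, ymax;
    std::uint32_t zmin, zmax;
};

// Same expression shape as BoxRegion::Area(). Every operand is a 32-bit
// unsigned value, so the product is reduced modulo 2^32 before it is returned.
std::uint32_t area_32bit(const Bounds& b) {
    assert(b.xmax >= b.xmin && b.ymax >= b.ymin && b.zmax >= b.zmin);
    return (b.ymax - b.ymin + 1u) *
           (b.xmax - b.xmin + 1u) *
           (b.zmax - b.zmin + 1u);
}

// Reference value computed in 64 bits. This is exact only while the true
// product fits in 64 bits, which holds for the example dimensions below.
std::uint64_t area_64bit(const Bounds& b) {
    assert(b.xmax >= b.xmin && b.ymax >= b.ymin && b.zmax >= b.zmin);
    return (static_cast<std::uint64_t>(b.ymax) - b.ymin + 1u) *
           (static_cast<std::uint64_t>(b.xmax) - b.xmin + 1u) *
           (static_cast<std::uint64_t>(b.zmax) - b.zmin + 1u);
}
```

Under that assumption, a single-frame region of 65536 x 65536 pixels (bounds {0, 65535, 0, 65535, 0, 0}) needs 4294967296 elements, yet the 32-bit product is 0.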

The return value is eventually saved in variable 'thelen' and then used in the buffer length check of gdcmImageRegionReader.cxx, line 445:


bool ImageRegionReader::ReadIntoBuffer(char *buffer, size_t buflen)
{
  size_t thelen = ComputeBufferLength();
  if( buflen < thelen )
    {
    gdcmDebugMacro( "buffer cannot be smaller than computed buffer length" );
    return false;
    }

  assert( Internals->GetFileOffset() != std::streampos(-1) );
  gdcmDebugMacro( "Using FileOffset: " << Internals->GetFileOffset() );
  std::istream* theStream = GetStreamPtr();
  theStream->seekg( Internals->GetFileOffset() );

  bool success = false;
  if( !success ) success = ReadRAWIntoBuffer(buffer, buflen);
  if( !success ) success = ReadRLEIntoBuffer(buffer, buflen);
  if( !success ) success = ReadJPEGIntoBuffer(buffer, buflen);
  if( !success ) success = ReadJPEGLSIntoBuffer(buffer, buflen);
  if( !success ) success = ReadJPEG2000IntoBuffer(buffer, buflen);

  return success;
}

As long as the length check is passed, all of the decoding functions (ReadRAWIntoBuffer, etc.) will assume that the input buffer is long enough so they will copy the image data into the buffer without further checks.

The image copy operations are executed by a number of memcpy() calls, such as the following one from gdcmJPEGLSCodec.cxx, line 514:


  memcpy(&(buffer[((z-zmin)*rowsize*colsize + (y-ymin)*rowsize)*bytesPerPixel]),
         tmpBuffer1, rowsize*bytesPerPixel);

An adversary can supply a specially crafted DICOM image file where the dimensions are such that:

  • the image size check discussed above is bypassed through the integer overflow
  • the number of bytes copied by memcpy() (i.e. the rowsize * bytesPerPixel argument above) is not itself subject to an integer overflow and is large enough to overflow the memcpy() destination buffer

This scenario would allow an attacker to overflow the target buffer with attacker-controlled data (i.e. image data) possibly leading, under certain conditions, to (remote) code execution.
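
The two conditions can be modelled as a small predicate. This is only a sketch: it assumes the pixel count is multiplied in 32-bit unsigned arithmetic and then scaled by the pixel size, which is a simplification of whatever ComputeBufferLength() actually does. CraftedImage, wrapped_length, row_bytes and satisfies_both_conditions are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of the dimensions read from an input file.
struct CraftedImage {
    std::uint32_t columns;
    std::uint32_t rows;
    std::uint32_t frames;
    std::uint32_t bytes_per_pixel;
};

// Length the buffer check would compare against, ASSUMING the pixel count is
// computed in 32-bit unsigned arithmetic before being widened to size_t.
std::size_t wrapped_length(const CraftedImage& img) {
    const std::uint32_t pixels = img.columns * img.rows * img.frames;  // may wrap
    return static_cast<std::size_t>(pixels) * img.bytes_per_pixel;
}

// Bytes written by one memcpy() of a full row. Computed in 64 bits so that
// the model itself cannot wrap.
std::uint64_t row_bytes(const CraftedImage& img) {
    return static_cast<std::uint64_t>(img.columns) * img.bytes_per_pixel;
}

// True when the length check passes and a single row copy is nevertheless
// larger than the caller's buffer.
bool satisfies_both_conditions(const CraftedImage& img, std::size_t buflen) {
    assert(img.bytes_per_pixel > 0);
    const bool check_bypassed = buflen >= wrapped_length(img);
    const bool row_overruns = row_bytes(img) > buflen;
    return check_bypassed && row_overruns;
}
```

For example, with 65536 columns, 65536 rows, one frame and one byte per pixel, the modelled length wraps to 0, so a 1024-byte buffer passes the check while each row copy is 65536 bytes. A well-formed 512 x 512 image with two bytes per pixel never satisfies both conditions.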

The buffer overflow may occur regardless of the size of the buffer allocated, just as if ImageRegionReader::ReadIntoBuffer contained no buffer length checks.

If a vulnerable version of the library must be used, there are proactive actions that can be taken to prevent the effects of the buffer overflow, such as detecting the dimensions-based integer overflow prior to calling the vulnerable API call.
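
One possible form of such a pre-check is sketched below. It recomputes the required size with overflow-checked 64-bit multiplication and refuses anything that would not fit in 32 bits, on the assumption that this is where the library's own arithmetic wraps. checked_mul and region_fits are hypothetical helpers, not GDCM API, and the caller is assumed to have obtained the dimensions and pixel size from the file beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Multiplies a and b, returning false instead of wrapping on overflow.
bool checked_mul(std::uint64_t a, std::uint64_t b, std::uint64_t& out) {
    if (a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a) {
        return false;
    }
    out = a * b;
    assert(a == 0 || out / a == b);  // sanity check: the product did not wrap
    return true;
}

// Returns true only if the region's true byte size is non-zero, fits in
// 32 bits (an assumed, conservative limit) and fits in the caller's buffer.
bool region_fits(std::uint32_t columns, std::uint32_t rows,
                 std::uint32_t frames, std::uint32_t bytes_per_pixel,
                 std::size_t buflen) {
    std::uint64_t total = columns;
    if (!checked_mul(total, rows, total)) return false;
    if (!checked_mul(total, frames, total)) return false;
    if (!checked_mul(total, bytes_per_pixel, total)) return false;
    if (total == 0) return false;
    if (total > std::numeric_limits<std::uint32_t>::max()) return false;
    return total <= buflen;
}
```

An application would call region_fits() with the dimensions it is about to pass to SetRegion() and skip the ReadIntoBuffer call when it returns false. The 32-bit ceiling is deliberately stricter than necessary; it trades support for very large images against not having to know the exact integer widths used inside the vulnerable library.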

Exploitation Notes

To further analyze the risk of this vulnerability we developed a proof-of-concept exploit following the strategy described below.

In file gdcmImageRegionReader.cxx, line 458 we see that the application supports a number of image codecs:


bool success = false;
if( !success ) success = ReadRAWIntoBuffer(buffer, buflen);
if( !success ) success = ReadRLEIntoBuffer(buffer, buflen);
if( !success ) success = ReadJPEGIntoBuffer(buffer, buflen);
if( !success ) success = ReadJPEGLSIntoBuffer(buffer, buflen);
if( !success ) success = ReadJPEG2000IntoBuffer(buffer, buflen);

Manipulating data in DICOM headers of any of the above image types will lead to a buffer overflow, but as it turns out only a few of them would allow us to avoid a segmentation fault (due to the large number of bytes that will need to be copied). JPEG-LS proved to be a good choice in that regard.

Eventually the program will reach gdcm::JPEGLSCodec::DecodeExtent() in gdcmJPEGLSCodec.cxx:


bool JPEGLSCodec::DecodeExtent(
   char *buffer,
    unsigned int xmin, unsigned int xmax,
    unsigned int ymin, unsigned int ymax,
    unsigned int zmin, unsigned int zmax,
    std::istream & is
  )
[...]
  else if ( NumberOfDimensions == 3 )
[...]
    for( unsigned int z = zmin; z <= zmax; ++z )
[...]
      std::vector<unsigned char> outv;
      bool b = DecodeByStreamsCommon(dummy_buffer, buf_size, outv);
      delete[] dummy_buffer;

      if( !b ) return false;

      unsigned char *raw = &outv[0];
      const unsigned int rowsize = xmax - xmin + 1;
      const unsigned int colsize = ymax - ymin + 1;
      const unsigned int bytesPerPixel = pf.GetPixelSize();

      const unsigned char *tmpBuffer1 = raw;
      for (unsigned int y = ymin; y <= ymax; ++y)
        {
        size_t theOffset = 0 + (0*dimensions[1]*dimensions[0] + y*dimensions[0] +
               xmin)*bytesPerPixel;
        tmpBuffer1 = raw + theOffset;
        memcpy(&(buffer[((z-zmin)*rowsize*colsize +
              (y-ymin)*rowsize)*bytesPerPixel]),
          tmpBuffer1, rowsize*bytesPerPixel);
        }
[...]
}

This function iterates over the JPEG-LS frames in the DICOM file by looping from 'zmin' to 'zmax' (our file is multi-frame, i.e. effectively 3-dimensional). For each frame it calls DecodeByStreamsCommon() to decode it, then copies the decoded frame into our small buffer by looping from 'ymin' to 'ymax' and calling memcpy() once per "row". At some point one of these copies writes past the end of the buffer.

Our goal is for one of the memcpy() calls to overflow the buffer without causing a segmentation fault, and then for the code to exit the loop immediately, so as to avoid a segmentation fault caused by a further call to memcpy().

We note that the function will return if the return value of DecodeByStreamsCommon() is false. As it turns out, it is not hard to arrange that:


bool JPEGLSCodec::DecodeByStreamsCommon(char *buffer,
       size_t totalLen, std::vector<unsigned char> &rgbyteOut)
{
  const BYTE* pbyteCompressed = (const BYTE*)buffer;
  size_t cbyteCompressed = totalLen;

  JlsParameters params = {};
  if(JpegLsReadHeader(pbyteCompressed, cbyteCompressed, &params) != OK )
    {
    gdcmDebugMacro( "Could not parse JPEG-LS header" );
    return false;
    }
[...]
}

To stop the loop, the attacker only has to place a malformed JPEG-LS header right after the frame that is responsible for the overflow.

An input file that causes a crash is available here, and sample code triggering the bug is available here.

Discussion

Applications that use the ImageRegionReader::ReadIntoBuffer API call (from GDCM versions 2.6.1, 2.6.0 and possibly earlier versions) to process untrusted medical image data may allow attackers to cause memory corruption, denial of service or possibly (remote) code execution on the systems hosting these applications.

The GDCM project has released version 2.6.2 that addresses this issue. It is advised to upgrade all GDCM installations to the latest stable release.

Disclosure Timeline

CVE assignment: December 2nd, 2015
Vendor Contact: December 4th, 2015
Vendor Patch Release: December 23rd, 2015
Public Advisory: January 5th, 2016