Why Should “Heap Inspection” Not Be Marked As False Positive?

Cenk Kalpakoğlu, 09 Nov 2018
AppSec, Secure Coding

Sensitive Data Exposure was ranked third (A3) in the OWASP Top 10 Application Security Risks for 2017, which leaves little room for doubt about the importance of handling data properly in your applications.

As we have recently started to see more Heap Inspection findings in applications, we have also noticed a pattern among our customers of marking “Heap Inspection” as a false positive.

When asked about the reason for this decision, the common justification has been the following: “The project is served in a safe environment, and the Garbage Collector (GC) is supposed to free memory after sensitive data has been used.”

Because this reasoning is technically incorrect, we felt the need to explain why attention must be paid to Privacy Violation vulnerabilities, focusing especially on Heap Inspection.

“Right off the bat, the answer to why you should pay attention to Sensitive Data Exposure vulnerabilities is Heartbleed.”

Heartbleed is a security bug in the OpenSSL cryptography library that lets an attacker read memory from applications relying on it, such as web servers, email, instant messaging (IM) and some virtual private networks (VPNs).

If you are familiar with Heartbleed, or if you had a chance to test it, you may have obtained web credentials or similar sensitive information from the server.
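To make the mechanism concrete, here is a toy simulation in Go. It is not OpenSSL's code: `arena`, `handleLogin`, `heartbeat` and `heartbeatChecked` are hypothetical names, and `arena` merely stands in for process memory that gets reused between requests. The bug it mimics is the one behind Heartbleed: the length of the reply comes from what the client claims, not from the payload actually received.

```go
package main

import (
	"errors"
	"fmt"
)

// arena stands in for process memory that is reused between requests
// without being cleared (hypothetical, for illustration only).
var arena = make([]byte, 64)

// handleLogin leaves the password in the shared buffer after use.
func handleLogin(password string) {
	copy(arena, password)
}

// heartbeat echoes claimedLen bytes back to the client.
// Bug: it trusts claimedLen instead of len(payload).
func heartbeat(payload []byte, claimedLen int) []byte {
	copy(arena, payload)
	if claimedLen < 0 {
		claimedLen = 0
	}
	if claimedLen > len(arena) {
		claimedLen = len(arena)
	}
	return append([]byte(nil), arena[:claimedLen]...)
}

// heartbeatChecked rejects requests whose claimed length exceeds the payload.
func heartbeatChecked(payload []byte, claimedLen int) ([]byte, error) {
	if claimedLen < 0 || claimedLen > len(payload) {
		return nil, errors.New("claimed length exceeds payload")
	}
	return append([]byte(nil), payload[:claimedLen]...), nil
}

func main() {
	handleLogin("user-1-password")

	// The client sends 2 bytes but claims to have sent 15.
	leaked := heartbeat([]byte("hi"), 15)
	fmt.Printf("%q\n", leaked) // "hier-1-password"

	_, err := heartbeatChecked([]byte("hi"), 15)
	fmt.Println(err) // claimed length exceeds payload
}
```

The leaked reply contains most of a password that a previous request left behind, which is why plain-text secrets lingering in memory matter even when the application's own logic never prints them.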

So, let’s go through an example of how a dating website leaks credentials because of an OpenSSL library-related bug.

To start off on the right foot, it is crucial to understand what Garbage Collection is and why it is needed, so that we can point out how a web application with a garbage collection mechanism can still leak sensitive information.

According to Wikipedia, garbage collection (GC) is a form of automatic memory management.

The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program.

Now, let’s create a C program and observe memory.

To do this, we will need an instrumentation (dynamic analysis) tool.

#1 Step: install valgrind
[cx@blog]$ apt-get install valgrind
#2 Step: create a C program (leak.c)
--------------------------------------------------------------------------------------------
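#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
  char *str;
  /* memory allocation: 10 bytes hold "kondukto" (8 chars) plus the NUL */
  str = (char *)malloc(10);
  if (str == NULL) {
    exit(EXIT_FAILURE);
  }
  strcpy(str, "kondukto");
  /* cast explicitly: passing a char * for %u is undefined behavior */
  printf("String: %s, Address = %u\n", str, (unsigned)(uintptr_t)str);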
  /* free resource */
  free(str);
  return 0;
}
--------------------------------------------------------------------------------------------
#3 Step: Compile the C program and run
[cx@blog]$ gcc leak.c -o leak
[cx@blog]$ ./leak
String: kondukto, Address = 2087920224

Nice and clean.

Now we will run the same program under valgrind, which inspects the heap and prints a summary of what happened in memory.

#4 Step: Run valgrind and check the memory
[cx@blog]$ valgrind --leak-check=full -v ./leak
*snip*
==24737== HEAP SUMMARY:
==24737== in use at exit: 0 bytes in 0 blocks
==24737== total heap usage: 2 allocs, 2 frees, 1,034 bytes allocated
==24737==
==24737== All heap blocks were freed -- no leaks are possible

*snip*

As you can see, everything is OK.

Now we will create a memory leak and observe the output again. Easy stuff: just change “free(str);” to “//free(str);”, then compile and run via valgrind again.

#5 Step: Comment out “free(str);”
---------------------------------------------------------------------------
/* free resource */
//free(str);
---------------------------------------------------------------------------
#6 Step: Compile the C program
[cx@blog]$ gcc leak.c -o leak
#7 Step: Run valgrind and check the memory
[cx@blog]$ valgrind --leak-check=full -v ./leak
*snip*
==24814== HEAP SUMMARY:
==24814== in use at exit: 10 bytes in 1 blocks
==24814== total heap usage: 2 allocs, 1 frees, 1,034 bytes allocated
==24814==
==24814== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1
==24814== at 0x483777F: malloc (vg_replace_malloc.c:299)
==24814== by 0x10906D: main (in /tmp/blog/leak)
==24814==
==24814== LEAK SUMMARY:
==24814== definitely lost: 10 bytes in 1 blocks
==24814== indirectly lost: 0 bytes in 0 blocks
==24814== possibly lost: 0 bytes in 0 blocks
==24814== still reachable: 0 bytes in 0 blocks
==24814== suppressed: 0 bytes in 0 blocks
==24814==
==24814== For counts of detected and suppressed errors, rerun with: -v
==24814== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
*snip*

Oops, valgrind reports that 10 bytes in 1 block are definitely lost, exactly as expected.

As you may recall, we allocated 10 bytes of memory (str = (char *)malloc(10)), and because we did not free that resource, a memory leak occurred.

It should be noted that this is only a very simplified stand-in for GC: C has no garbage collector, and the manual malloc/free pair merely illustrates the bookkeeping that a collector automates. In reality, GC works in a much more complicated way.

As a very complex form of memory management, GC does more than merely “free” resources when a function returns; it can even decide to hold on to data for reallocation or for performance reasons.

Many high-level programming languages use GC to take the crucial and complex task of memory management off developers’ plates, since they already have so much to think about.
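As a small illustration of that convenience, the Go sketch below allocates buffers, never frees them explicitly, and leaves reclamation to the runtime. The helper names (`makeGarbage`, `gcCycles`, `collectNow`) and the `sink` variable are made up for this example; `runtime.GC` is called only to make the effect observable, since normally the runtime decides on its own when to collect.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the most recent buffer reachable so the allocations
// are made on the heap rather than on the stack.
var sink []byte

// makeGarbage allocates n buffers of 1 KiB each. Every assignment to sink
// makes the previous buffer unreachable, i.e. garbage.
func makeGarbage(n int) int {
	total := 0
	for i := 0; i < n; i++ {
		sink = make([]byte, 1024)
		total += len(sink)
	}
	return total
}

// gcCycles returns the number of completed garbage collection cycles.
func gcCycles() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

// collectNow forces one collection and reports the cycle count
// before and after it.
func collectNow() (before, after uint32) {
	before = gcCycles()
	runtime.GC() // blocks until the collection is complete
	after = gcCycles()
	return before, after
}

func main() {
	allocated := makeGarbage(10000)
	before, after := collectNow()
	fmt.Printf("allocated %d bytes without a single free\n", allocated)
	fmt.Printf("GC cycles: %d -> %d\n", before, after)
}
```

Note what the program measures: that unreachable memory gets reclaimed. It tells us nothing about when the old contents of that memory stop being readable.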

However, this creates a false expectation among developers that GC will also take care of handling sensitive information properly.

The way GC works, combined with developers’ over-reliance on it as if it were a security solution, results in an application security risk: plain-text information can be exposed when a memory-related vulnerability like Heartbleed occurs.

Let’s test it and see how!

We will write and run a Go program that simulates “authentication”. While it is running, we will dump its memory and look for passwords.

Since Go is a garbage-collected language, it makes a proper sample.

#1 Step: Write a Go program (gctest.go)
-----------------------------------------------------------------
package main

import (
	"fmt"
	"time"
)

// login simulates an authentication check that takes some time.
func login(passwd string) bool {
	time.Sleep(500 * time.Millisecond)
	return len(passwd) > 0
}

func main() {
	for i := 0; i < 5; i++ {
		// pass i as an argument so each goroutine works on its own copy
		go func(id int) {
			passwd := fmt.Sprintf("user-%d-password", id)
			if login(passwd) {
				fmt.Println("success")
			}
		}(i)
		time.Sleep(time.Second)
	}
}
-----------------------------------------------------------------
#2 Step: Build it
[cx@cxpc blog]$ go build gctest.go
#3 Step: Create a dump.sh file
To dump the memory we will use a simple bash script that reads the memory regions from
/proc/<pid>/maps and uses gdb to dump their content.
#4 Step: Run the Go program and dump the memory
[cx@blog]$ ./gctest # in one terminal
[cx@blog]$ sudo ./dump.sh `ps ax|grep gctest|awk '{print $1;exit}'` # other term
#5 Step: Investigate the memory
[cx@blog]$ grep "password" *.dump
Binary file 8864-c41fff8000-c420100000.dump matches
[cx@blog]$ hexdump -C 8864-c41fff8000-c420100000.dump |grep -A5 password
*
000ae000 75 73 65 72 2d 31 73 0a 01 00 00 00 00 00 00 00 |user-1s.........|
000ae010 75 73 65 72 2d 33 2d 70 61 73 73 77 6f 72 64 00 |user-3-password.|
000ae020 75 73 65 72 2d 31 2d 70 61 73 73 77 6f 72 64 00 |user-1-password.|
000ae030 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000ae040 75 73 65 72 2d 33 2d 70 61 73 73 77 6f 72 64 00 |user-3-password.|
000ae050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
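
Step 3 mentions dump.sh without listing it. As a rough idea of what its first half has to do, the sketch below picks the readable and writable regions out of a maps file. It is written in Go rather than bash and is not the script used for the dump above; `region` and `parseMapsLine` are hypothetical names, and the program inspects its own process through /proc/self/maps, so it only runs on Linux.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// region is the address range of one mapping.
type region struct {
	start, end uint64
}

// parseMapsLine parses one line of /proc/<pid>/maps, for example
// "c41fff8000-c420100000 rw-p 00000000 00:00 0".
// ok is false for malformed lines and for mappings that are not
// both readable and writable.
func parseMapsLine(line string) (r region, ok bool) {
	fields := strings.Fields(line)
	if len(fields) < 2 || !strings.HasPrefix(fields[1], "rw") {
		return region{}, false
	}
	bounds := strings.SplitN(fields[0], "-", 2)
	if len(bounds) != 2 {
		return region{}, false
	}
	start, err := strconv.ParseUint(bounds[0], 16, 64)
	if err != nil {
		return region{}, false
	}
	end, err := strconv.ParseUint(bounds[1], 16, 64)
	if err != nil {
		return region{}, false
	}
	return region{start: start, end: end}, true
}

func main() {
	f, err := os.Open("/proc/self/maps")
	if err != nil {
		fmt.Println("cannot read maps (Linux only):", err)
		return
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if r, ok := parseMapsLine(sc.Text()); ok {
			fmt.Printf("%x-%x (%d bytes)\n", r.start, r.end, r.end-r.start)
		}
	}
}
```

Each start and end pair is the kind of range that can be handed to gdb's `dump memory` command, which is presumably where dump file names such as 8864-c41fff8000-c420100000.dump come from.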

As the example shows, even with GC in place, some of our “sensitive information” is still readable in memory, which means we should not rely on garbage collectors for security purposes.

The heart of the matter is this: if you care about Privacy Violation vulnerabilities, never store sensitive information in plain text, not even in memory.
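
One common way to reduce the exposure window in Go is to keep a secret in a mutable byte slice and overwrite it as soon as it has been used, instead of holding it in an immutable string until the collector gets around to it. The sketch below is a minimal illustration under that assumption; `wipe` is a hypothetical helper, and `login` is a stand-in check like the one in gctest.go.

```go
package main

import "fmt"

// wipe overwrites every byte of b with zero. It only clears this buffer;
// it cannot reach copies of the secret made elsewhere (for example by
// converting the slice to a string).
func wipe(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

// login is a stand-in for a real credential check.
func login(passwd []byte) bool {
	return len(passwd) > 0
}

func main() {
	// In a real program the bytes would come from the request or a
	// terminal prompt, not from a literal in the binary.
	passwd := []byte("user-1-password")

	ok := login(passwd)
	wipe(passwd) // clear the secret as soon as it is no longer needed

	fmt.Println("authenticated:", ok)
	fmt.Println("buffer after wipe:", passwd)
}
```

This is a mitigation rather than a guarantee: every conversion between string and []byte creates another copy, and buffers inside libraries are out of reach, so the fewer places a plain-text secret passes through, the better.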

To find the correct way to store your data, go through the Secure Password Storage examples in your programming language’s built-in security libraries, or check third-party libraries that support established algorithms (bcrypt, scrypt, PBKDF2…). Cheers!
