Using Silicon Secured Memory to Detect Memory Access Errors at Hardware Speeds

steph-choyer-Oracle, Feb 25 2016 (edited Dec 15 2016)

by Raj Prakash

Hardware support provided by Oracle's SPARC M7 processor can detect a wide range of memory access errors at hardware speed.

Introduction

In a previous article, we discussed the types and the severity of memory access errors. We also covered the tools available in Oracle Solaris Studio for detecting this class of error.

In this article, we discuss the limitations of performing this kind of error checking in software. We then describe the hardware support provided by Oracle's SPARC M7 processor for detecting memory access errors, and we point out the benefits of hardware analysis over software analysis.

Limitations of Dynamic Instrumentation

The most obvious limitation of dynamic instrumentation is that the instrumented program runs much slower than the original program. This is because the instrumented program needs to keep a database of each allocated chunk of memory. Each memory read and write instruction has to be augmented with other instructions to read the information from the database to find out whether the memory location is valid. Instrumented programs typically run more than 30x slower.

This slowdown often makes it impractical to do dynamic checking of a program with a large test suite. Consequently, dynamic checking is mostly used for debugging a specific problem or with a smaller test suite.

There are also some limitations on the errors that dynamic instrumentation can detect. The ability to detect memory errors relies on being able to distinguish between valid memory accesses and invalid memory accesses. Some kinds of memory access are clearly errors; for example, a read of uninitialized memory is unambiguously a problem. Other situations are not so clear-cut.

As an example, consider an application that allocates a structure, uses that structure, and then frees it. If a pointer to this region of memory is used after the free, the tool can detect that it is an access to invalid memory and can report it as a "freed memory" access error. This situation is known as a dangling pointer: a pointer that points to a region of memory that is no longer valid.

However, if later malloc() reuses the same region of memory, as shown in Listing 1, the memory is considered valid again. Now a memory access through the stale pointer is indistinguishable from a memory access through a legitimate pointer to the region. So it is not possible for a tool to report an error when the stale pointer is used.

int *area1 = malloc(64);
free(area1);
char *area2 = malloc(64); // area2 gets the memory area just freed by area1
area1[0] = 0;             // Stale-Pointer Access

Listing 1.

f1.png

Figure 1.

There is a similar situation in which a pointer gets corrupted. If the corrupted pointer happens to point to a valid region of memory, it's not possible for a tool to determine that this is a corrupted, rather than a legitimate, pointer.
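The limitations described in this section can be illustrated with a minimal sketch of address-only checking. It is not how any particular tool is implemented; the table layout and the names track_alloc, track_free, and access_ok are hypothetical. Every checked access scans a table of live regions, which is where the slowdown comes from, and the check depends only on the address, so a stale pointer passes once the region is reused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shadow table: one entry per tracked allocation. */
#define MAX_REGIONS 16

typedef struct {
    uintptr_t start;  /* first address of the region */
    size_t    size;   /* length of the region in bytes */
    bool      live;   /* true while the region is allocated */
} region_t;

static region_t regions[MAX_REGIONS];
static unsigned long lookups;  /* table entries examined by access checks */

/* Record a new allocation in the first free slot; false if the table is full. */
static bool track_alloc(uintptr_t start, size_t size) {
    for (int i = 0; i < MAX_REGIONS; i++) {
        if (!regions[i].live) {
            regions[i].start = start;
            regions[i].size = size;
            regions[i].live = true;
            return true;
        }
    }
    return false;
}

/* Mark the live region that starts at 'start' as freed. */
static void track_free(uintptr_t start) {
    for (int i = 0; i < MAX_REGIONS; i++) {
        if (regions[i].live && regions[i].start == start) {
            regions[i].live = false;
            return;
        }
    }
}

/* The extra work added to every load and store: scan for a live region
 * that contains the address. Only the address is available to the check. */
static bool access_ok(uintptr_t addr) {
    for (int i = 0; i < MAX_REGIONS; i++) {
        lookups++;
        if (regions[i].live && addr >= regions[i].start &&
            addr - regions[i].start < regions[i].size) {
            return true;
        }
    }
    return false;
}
```

Calling track_alloc(0x1000, 64), then track_free(0x1000), makes access_ok(0x1000) return false, which corresponds to the "freed memory" report. After a second track_alloc(0x1000, 64) models the reuse, access_ok(0x1000) returns true again, whichever pointer the address came from.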

Hardware Support for Detecting Memory Access Errors

Oracle's SPARC M7 processor provides a hardware feature called Silicon Secured Memory (SSM), previously known as Application Data Integrity (ADI) and sometimes still referred to as such. This hardware feature allows real-time detection of memory access errors.

Data is stored in memory in units of 64 bytes called cache lines. So when data that consists of one or more bytes is loaded from memory, the entire block of 64 bytes containing that data is fetched. The latest SPARC processors extend this by adding four additional bits to each cache line. Fetching the 64-byte cache line also fetches these additional bits. These four bits are invisible to the application and are used to hold additional information for SSM.

The best way of thinking of the bits is to imagine them containing a color. For example, a value of one could be thought of as red, a value of two as green, and so on. So a cache line of 64 bytes can be thought of as both containing 64 bytes of data and having a color.

Whenever we need to access a memory location, we need to have a pointer to that memory location. Pointers are 64 bits in size, which allows a 64-bit processor to potentially address 16 exbibytes (EiB) of data, or roughly 17 million tebibytes (TiB). No current system can hold this much memory. For example, Oracle's SPARC M7-32 system can contain a staggering 64 TB of memory. Consequently, a 64-bit processor does not need to use all 64 bits in a pointer. Normally the unused bits are constrained to be all zeros or all ones, but SSM uses them for a different purpose.

Instead of requiring the most-significant four bits to be all zeros or all ones, SSM uses them to store color values. This means that all the pointers can be thought of as being colored in the same way as all the cache lines in memory are colored.

SSM uses the fact that we can color both pointer and memory to check for invalid memory accesses. A "green" cache line can be accessed only through a "green" pointer. It is an error to use a "green" pointer to access a "red" cache line. The hardware will cause a trap when such a color mismatch occurs.
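The color check can be modeled in ordinary C as a rough software simulation. This is only a sketch of the idea, not the hardware interface: the simulated memory size and the names tag_pointer, set_line_color, and access_allowed are assumptions made for illustration. The color is kept in the most-significant four bits of a 64-bit value, and each simulated 64-byte line has its own 4-bit color.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE   64u   /* bytes per cache line, as described above */
#define NUM_LINES   8u    /* size of the simulated memory (assumption) */
#define COLOR_SHIFT 60    /* the color lives in the top four pointer bits */
#define ADDR_MASK   ((UINT64_C(1) << COLOR_SHIFT) - 1)

static uint8_t line_color[NUM_LINES];  /* one 4-bit color per simulated line */

/* Combine an address and a color into a "colored" pointer value. */
static uint64_t tag_pointer(uint64_t addr, unsigned color) {
    return (addr & ADDR_MASK) | ((uint64_t)(color & 0xFu) << COLOR_SHIFT);
}

/* Extract the color stored in the top four bits. */
static unsigned pointer_color(uint64_t ptr) {
    return (unsigned)(ptr >> COLOR_SHIFT);
}

/* Extract the address with the color bits removed. */
static uint64_t pointer_addr(uint64_t ptr) {
    return ptr & ADDR_MASK;
}

/* Give the simulated line that contains 'addr' a color. */
static void set_line_color(uint64_t addr, unsigned color) {
    assert(addr / LINE_SIZE < NUM_LINES);
    line_color[addr / LINE_SIZE] = (uint8_t)(color & 0xFu);
}

/* Returns true when the pointer color matches the line color.
 * Real hardware would trap on a mismatch instead of returning false. */
static bool access_allowed(uint64_t ptr) {
    uint64_t addr = pointer_addr(ptr);
    assert(addr / LINE_SIZE < NUM_LINES);
    return line_color[addr / LINE_SIZE] == pointer_color(ptr);
}
```

If line 0 is colored 2 ("green") and line 1 is colored 1 ("red"), a pointer built with tag_pointer(0, 2) may access line 0, but the same pointer advanced by 64 bytes still carries color 2 and is rejected on line 1.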

f2.png

Figure 2.

Advantages of Hardware Support

The most obvious advantage of hardware support for memory error detection is the massive performance advantage. The hardware takes responsibility for checking that every memory access is valid, and this usually incurs a cost only if the access is invalid and the hardware has to cause a trap to report the error. Consequently most applications run at close to their usual speeds.

f3.png

Figure 3.

Another important advantage is that the software changes needed to support SSM can be provided in a library. An application does not need to have any instrumentation added in order for it to be checked. This means even existing applications for which the source code has been lost can be checked for correctness. For example, if the application is run with the command a.out, the following will enable SSM to check the application:

% LD_PRELOAD_64=<compiler>/lib/compilers/sparcv9/libdiscoverADI.so a.out

However, there is another advantage to SSM that is not immediately apparent. SSM can pick up a range of errors that normal instrumentation cannot identify. Earlier, we discussed that typical software instrumentation has a problem when there are stale pointers to reassigned memory locations or if a pointer happens to point to a valid memory location. It is very hard for software to be able to handle these situations, because it has no idea whether the memory location is a valid location for the pointer to address. SSM, on the other hand, encodes the "color" of a memory location into both the pointer and the memory location. So only a "red" pointer can address a "red" memory location.

Let's consider how this changes the stale-pointer situation. When a memory location is freed, the call to free() can change the color of that block of memory. So now a stale pointer to that block of memory will be the wrong color to access it. Using this approach, we can detect an access to that block through a freed pointer.

f4.png

Figure 4.

Now imagine that the block of memory is returned for new use by another call to malloc(). In this situation, we can change the color of the block again. An access through the stale pointer continues to report an error by trapping.

f5.png

Figure 5.

Listing 2 shows the example in C.

int *area1 = malloc(64);
free(area1);
char *area2 = malloc(64); // area2 gets the memory area just freed by area1
area1[0] = 0;             // Stale Pointer Access

Listing 2.
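The recoloring on free() and malloc() described above can be simulated with a single block of memory. This is a minimal sketch under stated assumptions, not the behavior of the actual library: sim_malloc, sim_free, and sim_access are hypothetical names, the block address is arbitrary, and the color rotation (1 through 15, with 0 left unused) is a choice made only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define COLOR_SHIFT 60
#define ADDR_MASK   ((UINT64_C(1) << COLOR_SHIFT) - 1)
#define BLOCK_ADDR  UINT64_C(0x1000)  /* address of the one simulated block */
#define NUM_COLORS  15u               /* colors 1..15 used by this sketch */

static unsigned block_color;     /* current color of the simulated block */
static unsigned next_color = 1;  /* next color to hand out */

/* Hand out colors 1, 2, ..., 15, 1, ... so consecutive colors always differ. */
static unsigned fresh_color(void) {
    unsigned c = next_color;
    next_color = (next_color % NUM_COLORS) + 1u;
    return c;
}

/* "Allocate" the block: recolor it and return a pointer of the same color. */
static uint64_t sim_malloc(void) {
    block_color = fresh_color();
    return BLOCK_ADDR | ((uint64_t)block_color << COLOR_SHIFT);
}

/* "Free" the block: recolor it so existing pointers no longer match. */
static void sim_free(uint64_t ptr) {
    (void)ptr;  /* only one block exists in this sketch */
    block_color = fresh_color();
}

/* true if the pointer color matches the block color; hardware would trap otherwise. */
static bool sim_access(uint64_t ptr) {
    return (unsigned)(ptr >> COLOR_SHIFT) == block_color;
}
```

In this model, a pointer returned by the first sim_malloc() fails sim_access() after sim_free(), and it keeps failing after a second sim_malloc() returns the same address with a new color, while the new pointer is accepted.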

The other situation is when a pointer ends up containing random data and, when dereferenced, happens to point to a valid block of data. In this case, the pointer is likely to be of the wrong color, so the error will be detected.

All these situations are very hard to detect in software, but they are caught with hardware support. So hardware support is not only faster and easier to apply to an existing application, but it also identifies an additional range of problems.

Listing 3 is an example program that shows the hardware detecting four types of errors: buffer overflow, freed-memory access, stale-pointer access, and freeing memory more than once.

  1  #include <stdlib.h>
  2  #include <stdio.h>
  3  int main() {
  4    int *area1 = malloc(sizeof(int)*16);
  5    int *area2 = malloc(sizeof(int)*100);
  6  
  7    for (int i = 0; i <= 16; i++)
  8      area1[i] = 0;     // Array Out of Bounds
  9  
 10    free(area1);
 11    area1[0] = 0;       // Freed Memory Access
 12    
 13    char *area3 = malloc(sizeof(char)*64);
 14    if ((void *)area1 == (void *)area3)
 15     printf("New area3 is same as old area1\n");
 16    area1[0] = 0;       // Stale Pointer Access
 17  
 18    free(area3);
 19    free(area3);
 20
 21    return 0;
 22  }

Listing 3.

As shown in Listing 4, the Listing 3 program can be run using SSM and the Oracle Solaris Studio discover tool by passing the flag -i adi to discover:

$ cc t.c -g -m64
$ discover -i adi -w - a.out
$ a.out

Listing 4.

The first problem that SSM detects is the buffer overflow on line 8 of Listing 3, where the array area1 is accessed with index 16 while the highest valid index is 15.

f6.png

Figure 6.

As shown in Listing 5, discover indicates the point in the code where the access occurs, plus the location in the code where the buffer is allocated.

ERROR 1 (ABW): writing to memory beyond array bounds at address 0x200000021047e040:
    main() + 0x38  <t.c:8>
             5:      int *area2 = malloc(sizeof(int)*100);
             6:
             7:      for (int i = 0; i <= 16; i++)
             8:=>      area1[i] = 0;     // Array Out of Bounds
             9:
            10:      free(area1);
            11:      area1[0] = 0;       // Freed Memory Access
    _start() + 0x108
was allocated at (64 bytes):
    main() + 0x8  <t.c:4>
             1:    #include <stdlib.h>
             2:    #include <stdio.h>
             3:    int main() {
             4:=>    int *area1 = malloc(sizeof(int)*16);
             5:      int *area2 = malloc(sizeof(int)*100);
             6:
             7:      for (int i = 0; i <= 16; i++)
    _start() + 0x108

Listing 5.

As shown in Listing 6, the next problem detected is the write to freed memory at line 11 of Listing 3; the memory was freed earlier at line 10.

ERROR 2 (FMW): writing to freed memory at address 0x200000021047e000:
    main() + 0x6c  <t.c:11>
             8:        area1[i] = 0;     // Array Out of Bounds
             9:
            10:      free(area1);
            11:=>    area1[0] = 0;       // Freed Memory Access
            12:
            13:      char *area3 = malloc(sizeof(char)*64);
            14:      if ((void *)area1 == (void *)area3)
    _start() + 0x108
was allocated at (64 bytes):
    main() + 0x8  <t.c:4>
             1:    #include <stdlib.h>
             2:    #include <stdio.h>
             3:    int main() {
             4:=>    int *area1 = malloc(sizeof(int)*16);
             5:      int *area2 = malloc(sizeof(int)*100);
             6:
             7:      for (int i = 0; i <= 16; i++)
    _start() + 0x108
freed at:
    main() + 0x5c  <t.c:10>
             7:      for (int i = 0; i <= 16; i++)
             8:        area1[i] = 0;     // Array Out of Bounds
             9:
            10:=>    free(area1);
            11:      area1[0] = 0;       // Freed Memory Access
            12:
            13:      char *area3 = malloc(sizeof(char)*64);
    _start() + 0x108

Listing 6.

There is a stale pointer access at line 16 of Listing 3. The memory pointed to by area1 was freed at line 10, but the memory was reused for area3 at line 13. As shown in Listing 7, discover reports this as a write to freed memory—even though the memory has been repurposed. This is an example of the kind of error that it is very hard for a software-only solution to detect.

ERROR 3 (FMW): writing to freed memory at address 0x200000021047e000:
    main() + 0xb4  <t.c:16>
            13:      char *area3 = malloc(sizeof(char)*64);
            14:      if ((void *)area1 == (void *)area3)
            15:       printf("New area3 is same as old area1\n");
            16:=>    area1[0] = 0;       // Stale pointer access
            17:
            18:      free(area3);
            19:      free(area3);
    _start() + 0x108
was allocated at (64 bytes):
    main() + 0x8  <t.c:4>
             1:    #include <stdlib.h>
             2:    #include <stdio.h>
             3:    int main() {
             4:=>    int *area1 = malloc(sizeof(int)*16);
             5:      int *area2 = malloc(sizeof(int)*100);
             6:
             7:      for (int i = 0; i <= 16; i++)
    _start() + 0x108
freed at:
    main() + 0x5c  <t.c:10>
             7:      for (int i = 0; i <= 16; i++)
             8:        area1[i] = 0;     // Array Out of Bounds
             9:
            10:=>    free(area1);
            11:      area1[0] = 0;       // Freed Memory Access
            12:
            13:      char *area3 = malloc(sizeof(char)*64);
    _start() + 0x108

Listing 7.

The final error reported by discover in Listing 8 is the double freeing of area3 at lines 18 and 19 of Listing 3.

ERROR 4 (DFM): double freeing memory at address 0x300000021047e000:
    main() + 0xc8  <t.c:19>
            16:      area1[0] = 0;       // Stale pointer access
            17:
            18:      free(area3);
            19:=>    free(area3);
            20:
            21:      return 0;
            22:    }
    _start() + 0x108
was allocated at (64 bytes):
    main() + 0x74  <t.c:13>
            10:      free(area1);
            11:      area1[0] = 0;       // Freed Memory Access
            12:
            13:=>    char *area3 = malloc(sizeof(char)*64);
            14:      if ((void *)area1 == (void *)area3)
            15:       printf("New area3 is same as old area1\n");
            16:      area1[0] = 0;       // Stale pointer access
    _start() + 0x108
freed at:
    main() + 0xbc  <t.c:18>
            15:       printf("New area3 is same as old area1\n");
            16:      area1[0] = 0;       // Stale pointer access
            17:
            18:=>    free(area3);
            19:      free(area3);
            20:
            21:      return 0;
    _start() + 0x108

DISCOVER SUMMARY:
    unique errors   : 4 (4 total)

Listing 8.

Robust Checking from Smart Algorithms and Probability

SSM uses only a subset of the bits in the pointer, and it still provides very robust error detection. Using a single bit, we get a 50 percent chance of a pointer matching a region of memory when it shouldn't. If we were to use two bits, we would have a 75 percent chance of catching an error. With three bits, we would have 87.5 percent chance, and so on.
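The percentages above follow from a simple formula: if colors were assigned at random, a pointer would match the wrong region with probability 1/2^n for n color bits, so an error would be caught with probability 1 - 1/2^n. The helper below is a small illustrative sketch; its name is hypothetical.

```c
#include <assert.h>

/* Probability that a random color mismatch exposes an error when
 * 'bits' color bits are used and colors are assigned at random. */
static double random_detection_probability(unsigned bits) {
    assert(bits < 32);  /* keep the shift below well defined */
    return 1.0 - 1.0 / (double)(1u << bits);
}
```

For one, two, and three bits this gives 0.5, 0.75, and 0.875. For the four bits per cache line described earlier, the same formula gives 0.9375, again assuming purely random color assignment.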

However, that is true only if the colors were assigned randomly. The memory allocation routines give different colors to adjacent areas. Therefore, a buffer overflow into the neighboring area is detected 100 percent of the time. The security vulnerabilities caused by buffer overflows, such as those exploited by Heartbleed and Venom, are stopped every time. Freed-memory access is also caught reliably.

What's more, even stale-pointer access (a freed-memory access to an area that has subsequently been allocated for another purpose), which no software tool to date detects, is caught nearly 100 percent of the time. That is because, in practice, a stale-pointer access happens very soon after an area is reused, and the area will have been assigned a new color upon its reuse. The area does not return to its original color until the memory management routines have cycled through many allocations and frees of the same area.
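The effect of giving adjacent areas different colors can be sketched as follows. The actual assignment policy of the memory allocation routines is not described here, so the round-robin scheme, the heap size, and the function names are assumptions made only for illustration. The point is that any deterministic scheme in which neighbors never share a color turns the probabilistic check into a certain one for overflows into the neighboring area.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BLOCKS 64u  /* number of adjacent areas in the simulated heap (assumption) */
#define NUM_COLORS 15u  /* colors 1..15; leaving 0 unused is a choice of this sketch */

/* Round-robin coloring: block i and block i + 1 never share a color. */
static unsigned color_for_block(unsigned index) {
    assert(index < NUM_BLOCKS);
    return (index % NUM_COLORS) + 1u;
}

/* An overflow from block 'index' into its right-hand neighbor is caught
 * whenever the two blocks have different colors. */
static bool overflow_detected(unsigned index) {
    assert(index + 1u < NUM_BLOCKS);
    return color_for_block(index) != color_for_block(index + 1u);
}

/* Check every adjacent pair in the simulated heap. */
static bool all_neighbors_differ(void) {
    for (unsigned i = 0; i + 1u < NUM_BLOCKS; i++) {
        if (!overflow_detected(i)) {
            return false;
        }
    }
    return true;
}
```

In this model all_neighbors_differ() returns true, including at the wrap-around from color 15 back to color 1.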

Conclusion

The hardware support provided by the Silicon Secured Memory feature of Oracle's SPARC M7 processor changes the game for application correctness. This hardware support means that applications run at nearly full speed; consequently the correctness of an application can be checked over full and extensive test suites. In this way, we can be nearly certain that all parts of the code have been exercised and tested.

The hardware support for memory error checking extends beyond what can typically be achieved with software instrumentation. The SPARC M7 processor therefore combines hardware-speed testing with detection of a wide range of memory access errors.

About the Author

Raj Prakash is a senior software architect at Oracle. Currently he is the technical lead for several code analysis tools and global optimizers. His expertise is in developing tools designed to improve application security, performance, and scalability. He also writes a blog.

Revision 1.0, 03/07/2016
