Fixing C++ Exception Handling on NetBSD/vax

Here’s another repost from my old WordPress blog about fixing C++ exception support in NetBSD for VAX. Unfortunately, I got sidetracked by other projects, and then neglected my blog completely, so I used the Internet Archive to find the original to copy-and-paste here. I will have to return to this post and try to merge my fixes into the current version of NetBSD, along with any other needed GCC patches. Originally posted on Mar. 21, 2016.

Be forewarned that this post is extremely technical and even I don’t remember half of the things I was talking about. Most of my other posts are of a much more general interest than this one.

Readers of my older blog posts or Twitter stream may know that I’m a fan of keeping old hardware running with new open source software. Any 32-bit system from about 1990 or later that was well-designed in its era and has sufficient RAM (128MB is good, more is better) can still be put to good use today, especially with NetBSD, which runs on a wide variety of architectures.

I’ve been very happy with the VAXstation 4000 Model 90 that I purchased from eBay back in 2000, which still runs completely reliably today. I had originally bought it intending to run OpenVMS, which it does, but that OS is pretty dated by today’s standards, especially the VAX version, which is missing all of the useful features added since 1995 to the Alpha and Itanium ports of that OS. I also own a few Alpha and Itanium servers to run OpenVMS if I’m feeling particularly strange, but NetBSD is a far more useful choice to run today, because you can recompile more programs to run on it with far fewer changes once the compiler, kernel, and system libraries are running reliably.

The biggest problem with NetBSD on the VAX today is that C++ exception handling wasn’t working, which meant that among other things you could not run the useful test suite that NetBSD includes, which uses a test harness called ATF that requires C++ exceptions to work for it to start. It took me several solid days of effort to figure out what needed to be fixed, but now I have patches to make C++ exceptions work (with GCC 4.8.5) on NetBSD 7.0 or later. I’ll send them to the relevant email lists after I publish this blog post. Three of the four patches are solid, but the last one is a hack that I suspect could be more cleanly fixed with an additional change to the VAX configuration files instead of where I patched it.

My hope is that by the end of the week, someone with NetBSD commit privileges will be able to check in the fixes so that C++ exception handling will be working on VAX for everyone out of the box. While it took me about 30 hours of effort to figure out what was going wrong, I can explain the fixes to make it work in a lot less time. The amazing thing is that 99% of the code was already working correctly, except for the bugs which I’ll now describe.

Here’s the C++ test program that I wrote to verify what was broken and discover when I had fixed it. It throws a few different basic types, along with a C++ class that I copied from the test framework code to throw a “file not found” exception:

#include <stdexcept>
#include <string>

#include <stdio.h>
#include <string.h>
#include <errno.h>

/* From external/bsd/atf/dist/tools/exceptions.hpp. */
namespace tools {

class system_error : public std::runtime_error {
    int m_sys_err;
    mutable std::string m_message;

public:
    system_error(const std::string&, const std::string&, int);
    ~system_error(void) throw();

    int code(void) const throw();
    const char* what(void) const throw();
};

} // namespace tools

/* From external/bsd/atf/dist/tools/exceptions.cpp. */
tools::system_error::system_error(const std::string& who,
                                  const std::string& message,
                                  int sys_err) :
    std::runtime_error(who + ": " + message),
    m_sys_err(sys_err)
{
}

tools::system_error::~system_error(void)
    throw()
{
}

int
tools::system_error::code(void)
    const throw()
{
    return m_sys_err;
}

const char*
tools::system_error::what(void)
    const throw()
{
    try {
        if (m_message.length() == 0) {
            m_message = std::string(std::runtime_error::what()) + ": ";
            m_message += ::strerror(m_sys_err);
        }

        return m_message.c_str();
    } catch (...) {
        return "Unable to format system_error message";
    }
}

/* Throw from several frames down so the unwinder has to walk the stack. */
int recursive_throw(int i) {
  printf("enter recursive_throw(%d)\n", i);
  if (i > 0) {
    printf("calling recursive_throw(%d)\n", i - 1);
    recursive_throw(i - 1);
  } else {
    printf("throwing exception\n");
    throw 456;
  }
  printf("exit recursive_throw(%d)\n", i);
  return i;
}

/* Test several kinds of throws. */
int throwtest(int test) {
  switch (test) {
    case 0:
    case 1:
      return test;
    case 2:
      throw 123;
    case 3:
      throw 123.45;
    case 4:
      throw 678.9f;
    case 5:
      throw tools::system_error("::eaccess", "Cannot get info from file", ENOENT);
    case 6:
      recursive_throw(3);  // depth chosen arbitrarily
      return 432;  // fail
    default:
      return 999;  // not used in test
  }
}

int main() {
  for (int i = 0; i < 7; i++) {
    try {
      int ret = throwtest(i);
      printf("throwtest(%d) returned %d\n", i, ret);
    } catch (int e) {
      printf("Caught int exception: %d\n", e);
    } catch (double d) {
      printf("Caught double exception: %f\n", d);
    } catch (float f) {
      printf("Caught float exception: %f\n", (double)f);
    } catch (const tools::system_error& e) {
      printf("caught const system_error&: code=%d\n", e.code());
      printf(" and e.what() is %s\n", e.what());
    }
  }
  return 0;
}

The first thing I had to do was figure out how C++ exceptions work on ELF systems, which include NetBSD, FreeBSD, Linux, Solaris, and many others. They use something called DWARF stack unwinding, which parses data that the compiler and linker store in the .eh_frame section to describe how to restore each stack frame in the event of an exception so the correct C++ handler is called. I believe the same principle is used to handle exceptions in other GCC-supported language front-ends, such as Java and Go. I’m not sure how Ada exceptions work. For the purposes of this blog post, I’m only concerned with C++.

My test program initially crashed early in the stack unwinder in _Unwind_RaiseException() (defined in libgcc/unwind.inc, included from libgcc/unwind-dw2.c), when the unwinder tried to follow some bad pointers. Fortunately, the DWARF stack frame info decoded correctly, so I won’t have more to say about that here. For more info, see the original Itanium C++ exception handling spec that every other architecture copied (except for ARM, which has some slight variations in the details), this series of blog posts dissecting how it works on x86, and Ian Lance Taylor’s blog posts from 2011 describing the format of the various tables used to unwind the stack.

Function calls on VAX UNIX use the CALLS instruction, which sets up a specific stack frame format, described in a comment in vax_expand_prologue() in gcc/config/vax/vax.c. I’ve pasted the entire function (including one change I made which I’ll describe later), as it also emits some CFA info which the assembler will use to generate the unwind tables whose details I’ve skipped over.

void
vax_expand_prologue (void)
{
  int regno, offset;
  int mask = 0;
  HOST_WIDE_INT size;
  rtx insn;

  offset = 20;
  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
    if ((df_regs_ever_live_p (regno) && !call_used_regs[regno])
        || (crtl->calls_eh_return && regno >= 2 && regno < 4))
      {
        mask |= 1 << regno;
        offset += 4;
      }

  insn = emit_insn (gen_procedure_entry_mask (GEN_INT (mask)));
  RTX_FRAME_RELATED_P (insn) = 1;

  /* The layout of the CALLG/S stack frame is follows:
                <- CFA, AP
        ...     Registers saved as specified by MASK
        old fp
        old ap
        old psw
                <- FP, SP
     The rest of the prologue will adjust the SP for the local frame.  */
  add_reg_note (insn, REG_CFA_DEF_CFA,
                plus_constant (Pmode, frame_pointer_rtx, offset));

  insn = emit_insn (gen_blockage ());
  RTX_FRAME_RELATED_P (insn) = 1;

  vax_add_reg_cfa_offset (insn, 4, gen_rtx_REG (Pmode, PSW_REGNUM));
  vax_add_reg_cfa_offset (insn, 8, arg_pointer_rtx);
  vax_add_reg_cfa_offset (insn, 12, frame_pointer_rtx);
  vax_add_reg_cfa_offset (insn, 16, pc_rtx);

  offset = 20;
  for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
    if (mask & (1 << regno))
      {
        vax_add_reg_cfa_offset (insn, offset, gen_rtx_REG (SImode, regno));
        offset += 4;
      }

  /* Allocate the local stack frame.  */
  size = get_frame_size ();
  emit_insn (gen_addsi3 (stack_pointer_rtx,
                         stack_pointer_rtx, GEN_INT (-size)));

  /* Do not allow instructions referencing local stack memory to be
     scheduled before the frame is allocated.  This is more pedantic
     than anything else, given that VAX does not currently have a
     scheduling description.  */
  emit_insn (gen_blockage ());
}

My initial investigations showed that the .cfi data was correctly emitted to the assembler file and the final executables, and that the data structures were being restored into memory. Compiling libgcc.a and libstdc++.a with debugging info (-g) and linking statically (-static) made GDB work correctly, so I was able to set breakpoints, disassemble the code, see the value of the registers, and so on. This saved a tremendous amount of time.

The cause of the first crash was that the initial call to uw_init_context_1() from uw_init_context() in libgcc/unwind-dw2.c calls __builtin_dwarf_cfa(), a GCC builtin that needs to return the value of CFA. This was returning %fp (frame pointer), but it should have been returning %ap (argument pointer), which is a separate register on VAX, and already points directly to the start of the CFA (the CFA pointer is a DWARF concept, not a VAX one).

My first patch was to replace #define FRAME_POINTER_CFA_OFFSET(FNDECL) 0 with #define ARG_POINTER_CFA_OFFSET(FNDECL) 0 in gcc/config/vax/elf.h and gcc/config/vax/vax.h. Actually, I removed the old definition from elf.h and vax.h and added ARG_POINTER_CFA_OFFSET to vax.h only.

Index: gcc/config/vax/vax.h
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/vax.h,v
retrieving revision 1.3
diff -u -u -r1.3 vax.h
--- gcc/config/vax/vax.h	23 Sep 2015 03:39:18 -0000	1.3
+++ gcc/config/vax/vax.h	21 Mar 2016 12:06:26 -0000
@@ -168,12 +168,12 @@
 /* Base register for access to local variables of the function.  */
-/* Offset from the frame pointer register value to the top of stack.  */
-#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0
 /* Base register for access to arguments of the function.  */
+/* Offset from the argument pointer register value to the CFA.  */
+#define ARG_POINTER_CFA_OFFSET(FNDECL) 0
 /* Register in which static-chain is passed to a function.  */

Here’s my diff to gcc/config/vax/elf.h, which includes partial patches for two more bugs I’ll discuss next.

Index: gcc/config/vax/elf.h
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/elf.h,v
retrieving revision 1.3
diff -u -u -r1.3 elf.h
--- gcc/config/vax/elf.h	23 Sep 2015 03:39:18 -0000	1.3
+++ gcc/config/vax/elf.h	21 Mar 2016 12:09:10 -0000
@@ -45,18 +45,8 @@
    count pushed by the CALLS and before the start of the saved registers.  */
-/* Offset from the frame pointer register value to the top of the stack.  */
-/* We use R2-R5 (call-clobbered) registers for exceptions.  */
-#define EH_RETURN_DATA_REGNO(N) ((N) < 4 ? (N) + 2 : INVALID_REGNUM)
-/* Place the top of the stack for the DWARF2 EH stackadj value.  */
-#define EH_RETURN_STACKADJ_RTX						\
-  gen_rtx_MEM (SImode,							\
-	       plus_constant (Pmode,					\
-			      gen_rtx_REG (Pmode, FRAME_POINTER_REGNUM),\
-			      -4))
+/* We use R2-R3 (call-clobbered) registers for exceptions.  */
+#define EH_RETURN_DATA_REGNO(N) ((N) < 2 ? (N) + 2 : INVALID_REGNUM)
 /* Simple store the return handler into the call frame.  */
 #define EH_RETURN_HANDLER_RTX						\
@@ -66,10 +56,6 @@
-/* Reserve the top of the stack for exception handler stackadj value.  */
 /* The VAX wants no space between the case instruction and the jump table.  */

After changing FRAME_POINTER_CFA_OFFSET to ARG_POINTER_CFA_OFFSET, the next crash of my C++ test program occurred inside the G++ personality function, __gcc_personality_v0(), defined in libgcc/unwind-c.c, when it tried to write values into __builtin_eh_return_data_regno (0) and __builtin_eh_return_data_regno (1) to return to the caller. Those are two CPU registers that I need to restore when the exception handler returns.

On VAX, EH_RETURN_DATA_REGNO(N) was defined as R2 to R5, but I changed the definition (see previous patch) to only include R2 and R3 because I needed to add those registers to the procedure entry mask next to make it work. The good news is that GCC defines an internal boolean, crtl->calls_eh_return, which I was able to test inside vax_expand_prologue() to only add R2/R3 to the procedure entry mask for C/C++ functions which actually call __builtin_eh_return (), which includes the frame we’re going to return from later at the end of _Unwind_RaiseException(). I included my change with the vax.c source earlier, so here’s the diff file showing just the patch.

Index: gcc/config/vax/vax.c
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/vax.c,v
retrieving revision 1.3
diff -u -u -r1.3 vax.c
--- gcc/config/vax/vax.c	23 Sep 2015 03:39:18 -0000	1.3
+++ gcc/config/vax/vax.c	21 Mar 2016 19:52:21 -0000
@@ -164,7 +164,8 @@
   offset = 20;
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (df_regs_ever_live_p (regno) && !call_used_regs[regno])
+    if ((df_regs_ever_live_p (regno) && !call_used_regs[regno])
+	|| (crtl->calls_eh_return && regno >= 2 && regno < 4))
       {
         mask |= 1 << regno;
         offset += 4;
       }

The next bug is more of an optimization, which is that I removed the definition of EH_RETURN_STACKADJ_RTX and STARTING_FRAME_OFFSET in config/vax/elf.h because they’re not needed on VAX. This code has been adding an extra 4 bytes to every stack frame to reserve storage for an offset value for stack unwinding that VAX doesn’t need because it has a separate frame pointer register that’s used on function return. One benefit of this is that %sp can be lowered within a function to reserve additional space on the stack, and it doesn’t have to be restored on exit. On VAX, the RET instruction restores the previous stack frame base from the %fp register, not %sp, which can have any value.

I also cleaned up some suspicious code in config/vax/ that was trying to deal with this exception handling stack adjustment variable that’s not needed. Here’s that patch.

Index: gcc/config/vax/
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/,v
retrieving revision 1.3
diff -u -u -r1.3
--- gcc/config/vax/	23 Sep 2015 03:39:18 -0000	1.3
+++ gcc/config/vax/	21 Mar 2016 19:57:38 -0000
@@ -436,7 +436,7 @@
   "vax_expand_addsub_di_operands (operands, MINUS); DONE;")
 (define_insn "sbcdi3"
-  [(set (match_operand:DI 0 "nonimmediate_addsub_di_operand" "=Rr,=Rr")
+  [(set (match_operand:DI 0 "nonimmediate_addsub_di_operand" "=Rr,Rr")
 	(minus:DI (match_operand:DI 1 "general_addsub_di_operand" "0,I")
 		  (match_operand:DI 2 "general_addsub_di_operand" "nRr,Rr")))]
@@ -786,6 +786,9 @@
 ;; These handle aligned 8-bit and 16-bit fields,
 ;; which can usually be done with move instructions.
+;; netbsd changed this to REG_P (operands[0]) || (MEM_P (operands[0]) && ...
+;; but gcc made it just !MEM_P (operands[0]) || ...
 (define_insn ""
   [(set (zero_extract:SI (match_operand:SI 0 "register_operand" "+ro")
 			 (match_operand:QI 1 "const_int_operand" "n")
@@ -1306,6 +1309,11 @@
   "decl %0\;jgequ %l1")
+;; Note that operand 1 is total size of args, in bytes,
+;; and what the call insn wants is the number of words.
+;; It is used in the call instruction as a byte, but in the addl2 as
+;; a word.  Since the only time we actually use it in the call instruction
+;; is when it is a constant, SImode (for addl2) is the proper mode.
 (define_expand "call_pop"
   [(parallel [(call (match_operand:QI 0 "memory_operand" "")
 		    (match_operand:SI 1 "const_int_operand" ""))
@@ -1314,24 +1322,17 @@
 			    (match_operand:SI 3 "immediate_operand" "")))])]
-  gcc_assert (INTVAL (operands[3]) <= 255 * 4 && INTVAL (operands[3]) % 4 == 0);
-  /* Operand 1 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[1] = GEN_INT (INTVAL (operands[3]) + 4);
+  gcc_assert (INTVAL (operands[1]) <= 255 * 4);
+  operands[1] = GEN_INT ((INTVAL (operands[1]) + 3) / 4);
 (define_insn "*call_pop"
   [(call (match_operand:QI 0 "memory_operand" "m")
 	 (match_operand:SI 1 "const_int_operand" "n"))
    (set (reg:SI VAX_SP_REGNUM) (plus:SI (reg:SI VAX_SP_REGNUM)
-					(match_operand:SI 2 "immediate_operand" "i")))]
+			     (match_operand:SI 2 "immediate_operand" "i")))]
-  operands[1] = GEN_INT ((INTVAL (operands[1]) - 4) / 4);
-  return "calls %1,%0";
+  "calls %1,%0")
 (define_expand "call_value_pop"
   [(parallel [(set (match_operand 0 "" "")
@@ -1342,12 +1343,8 @@
 			    (match_operand:SI 4 "immediate_operand" "")))])]
-  gcc_assert (INTVAL (operands[4]) <= 255 * 4 && INTVAL (operands[4]) % 4 == 0);
-  /* Operand 2 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[2] = GEN_INT (INTVAL (operands[4]) + 4);
+  gcc_assert (INTVAL (operands[2]) <= 255 * 4);
+  operands[2] = GEN_INT ((INTVAL (operands[2]) + 3) / 4);
 (define_insn "*call_value_pop"
@@ -1357,47 +1354,20 @@
    (set (reg:SI VAX_SP_REGNUM) (plus:SI (reg:SI VAX_SP_REGNUM)
 					(match_operand:SI 3 "immediate_operand" "i")))]
-  "*
-  operands[2] = GEN_INT ((INTVAL (operands[2]) - 4) / 4);
-  return \"calls %2,%1\";
-(define_expand "call"
-  [(call (match_operand:QI 0 "memory_operand" "")
-      (match_operand:SI 1 "const_int_operand" ""))]
-  ""
-  "
-  /* Operand 1 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[1] = GEN_INT (INTVAL (operands[1]) + 4);
+  "calls %2,%1")
-(define_insn "*call"
-   [(call (match_operand:QI 0 "memory_operand" "m")
-	  (match_operand:SI 1 "const_int_operand" ""))]
+;; Define another set of these for the case of functions with no operands.
+;; These will allow the optimizers to do a slightly better job.
+(define_insn "call"
+  [(call (match_operand:QI 0 "memory_operand" "m")
+	 (const_int 0))]
   "calls $0,%0")
-(define_expand "call_value"
-  [(set (match_operand 0 "" "")
-      (call (match_operand:QI 1 "memory_operand" "")
-	    (match_operand:SI 2 "const_int_operand" "")))]
-  ""
-  "
-  /* Operand 2 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[2] = GEN_INT (INTVAL (operands[2]) + 4);
-(define_insn "*call_value"
+(define_insn "call_value"
   [(set (match_operand 0 "" "")
 	(call (match_operand:QI 1 "memory_operand" "m")
-	      (match_operand:SI 2 "const_int_operand" "")))]
+	      (const_int 0)))]
   "calls $0,%1")

Almost finished! The final bug is the one that I understand the least. My patch is only one line but I don’t think I’m solving the problem correctly. I look forward to hearing from GCC experts to explain what the correct fix should be. The difference between the special __builtin_eh_return() call that’s used by GCC to return from the C++ exception throwing case and a normal return is that the return address on the previous stack frame needs to be modified to jump to the exception handler. What I was seeing after making all the previous changes to fix the other bugs is that my test program failed to catch any exceptions, but instead returned normally to the original return path.

Investigation revealed that GCC was correctly generating the necessary move instruction to copy the second parameter passed to __builtin_eh_return() into the return address, because EH_RETURN_HANDLER_RTX had been defined correctly in config/vax/elf.h. Here’s the relevant section at the end of expand_eh_return(), defined in gcc/except.c.

/* Expand __builtin_eh_return.  This exit path from the function loads up
   the eh return data registers, adjusts the stack, and branches to a
   given PC other than the normal return address.  */
void
expand_eh_return (void)
{
  rtx around_label;
  /* ... */
#ifdef HAVE_eh_return
  if (HAVE_eh_return)
    emit_insn (gen_eh_return (crtl->eh.ehr_handler));
  else
#endif
    {
#ifdef EH_RETURN_HANDLER_RTX
      rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
#else
      error ("__builtin_eh_return not supported on this target");
#endif
    }

  emit_label (around_label);
}

Some of the architectures supported by GCC define an eh_return pattern in their .md file to generate a special epilogue for this case, but it turns out that on VAX, as on several other architectures, the EH_RETURN_HANDLER_RTX path was generating the correct move instruction. The problem was that the optimizer was deleting the final move instruction whenever I compiled with -O or higher. The assembly code at -O0 (no optimization) generated for the __builtin_eh_return() call at the end of _Unwind_RaiseException() looked like:

        calls $2,_Unwind_DebugHook
        movl -12(%fp),%r1
        movl %r1,16(%fp)

But then when I compiled with -O1 or -O2, all I saw was:

        calls $2,_Unwind_DebugHook

This was a mystery to me, and I don’t know enough about how the peephole optimizer works to track down why it thinks it can remove the move instruction that overwrites the saved return address. My workaround was to set RTX_FRAME_RELATED_P (insn) = 1; after the emit_move_insn() in gcc/except.c, the same flag that vax_expand_prologue() sets on the procedure entry mask instruction. Here’s my patch to do that:

Index: gcc/except.c
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/except.c,v
retrieving revision 1.3
diff -u -u -r1.3 except.c
--- gcc/except.c	23 Sep 2015 03:39:10 -0000	1.3
+++ gcc/except.c	21 Mar 2016 20:12:10 -0000
@@ -2207,7 +2207,8 @@
-      emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+      rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+      RTX_FRAME_RELATED_P (insn) = 1;
       error ("__builtin_eh_return not supported on this target");

With this change, the optimizer no longer removes the move that writes the handler address into the saved return address slot, but it adds an extra line of .cfi exception info, which seems unnecessary since the code is immediately going to return from the call and any adjustment made by the DWARF stack unwinder will already have been done. Here’s what the optimized code looks like with the patch:

        calls $2,_Unwind_DebugHook
        movl %r6,16(%fp)
        .cfi_offset 6, -36

With that final change, C++ exception handling now finally works on NetBSD/vax, and I was able to successfully run the vast majority of the tests in the ATF testsuite, which had been completely inaccessible when I started due to both atf-run and atf-report immediately dumping core due to the bad pointers that I fixed. Now I have a bunch of new bugs to track down fixes for, but I think this was the hardest set of problems that needed to be solved to bring NetBSD on VAX up to the level of the other NetBSD ports.

With the exception of the hack to gcc/except.c, the patches I’ve posted are ready to check in to NetBSD as well as to upstream GCC. The fix I’d like to see for the bug of the emit_move_insn() being deleted by the optimizer would be a patch to one of the files in the gcc/config/vax directory to explain to the optimizer that writing to 16(%fp) is important and not something to be omitted from the function epilogue. I didn’t see any indication that any other GCC port requires anything special to stop the move instruction into EH_RETURN_HANDLER_RTX from being deleted, so my other suspicion is that this may be a bug specific to VAX’s peephole optimizer or other functions. Any ideas?

I apologize for not going into a great deal of additional detail, but this blog post is already 3000 words. Thanks for reading.





