
I tend to go with:

  gcc -c -Wall -Werror -std=c99 -pedantic -O3
I use -std=c99 because I use these two features of C:

1. Mixed declarations and code, e.g.

  double x = 4.8 * 5.3;
  printf("x = %.15g\n", x);
  double y = 8.7 * x;
  printf("y = %.15g\n", y);
2. Flexible array members, e.g. for my safe string operations:

  struct str
      {
      long len;
      char data[];
      };
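For anyone who hasn't used a flexible array member, here is a minimal sketch of how such a struct might be allocated. The helper name str_new and the extra NUL byte are assumptions made for illustration; the comment above doesn't show the actual string routines.

```c
#include <stdlib.h>
#include <string.h>

/* Same layout as the struct in the comment: a length, then the bytes. */
struct str {
    long len;
    char data[];   /* C99 flexible array member; must be the last field */
};

/* Hypothetical helper (not from the thread): copy len bytes from src
   into a freshly allocated struct str. Returns NULL if len is negative
   or the allocation fails. The caller releases the result with free(). */
struct str *str_new(const char *src, long len)
{
    if (len < 0)
        return NULL;

    /* sizeof(struct str) does not count data[], so add room for the
       bytes plus a terminating NUL (the NUL is an assumption here). */
    struct str *s = malloc(sizeof(struct str) + (size_t)len + 1);
    if (s == NULL)
        return NULL;

    s->len = len;
    if (len > 0)
        memcpy(s->data, src, (size_t)len);
    s->data[len] = '\0';
    return s;
}
```

The point of the single malloc is that the header and the bytes live in one block, so one free() releases both.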
If I were to use -ansi (same as -std=c89), instead of -std=c99, then -pedantic would give me these errors:

  error: ISO C90 forbids mixed declarations and code [-Werror=edantic]

  error: ISO C90 does not support flexible array members [-Werror=edantic]
(By the way, I have no idea why the error message omits the "p" from pedantic there. That doesn't smell right. I hope the gcc people fix that.)

I use -O3 for optimization, and I chose level 3 because that enables -finline-functions. I typically avoid macros, even for simple one-liners like this:

  /* Increment the reference count. */
  void hold(value f)
      {
      f->N++;
      }
With -finline-functions enabled (via -O3), I can see that function being expanded inline, by examining the assembly output of gcc -S -O3.
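A small file along these lines is enough to try that yourself. The struct behind value is an assumption (the comment only shows that it has a count field N), and hold_twice is a hypothetical caller added so there is a call site to inspect.

```c
/* Assumed definition: the thread only shows that `value` points at
   something with a reference count field called N. */
struct form {
    long N;   /* reference count */
};
typedef struct form *value;

/* Increment the reference count. `static inline` is only a hint; the
   compiler still decides whether to expand the call in place. */
static inline void hold(value f)
{
    f->N++;
}

/* Hypothetical caller. If hold() gets inlined, the assembly for this
   function should contain no call instruction targeting hold. */
long hold_twice(value f)
{
    hold(f);
    hold(f);
    return f->N;
}
```

Compiling this with gcc -S at different optimization levels and searching the output for a call to hold shows whether the expansion happened.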


Why don't you use -O2 -finline-functions then?


Good question. I go with -O3 because it does even more optimizations, but to be quite candid I really don't know what impact -fpredictive-commoning, -ftree-vectorize, and others have on my resulting machine code, if any.

I did a quick experiment, compiling one C file with -O2 -finline-functions and another with -O3, using the -S flag so I could see the assembler output.

The only difference I saw was this:

  .comm	free_list,8,8
Versus this:

  .comm	free_list,8,16
Who knows. I guess I'm really using -O3 because "3 is more than 2" -- in other words, "Ours goes to 11!" :)


Predictive commoning is a loop optimization that commons redundancies across loop iterations.

It is basically CSE (common subexpression elimination) "around loops". You can generalize it to subsume loop store motion and strength reduction, but most compilers (including GCC) don't bother.

For example, it transforms

  for (int i = 0; i < 50; i++)
    a[i+2] = a[i] + a[i+1];
into

  p0 = a[0];
  p1 = a[1];
  for (int i = 0; i < 50; i++) {
    a[i+2] = p2 = p0 + p1;
    p0 = p1;
    p1 = p2;
  }
Eliminating a whole ton of loads: both reads from a[] disappear from every iteration, and only the store to a[i+2] remains.
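To convince yourself the two loops above compute the same thing, here is a sketch that writes both out as functions and compares the results. The function names, the LEN constant, and the unsigned long long element type are choices made for this example; the commoned version is written by hand and is not GCC's actual output.

```c
#define LEN 52   /* 50 iterations write a[2] through a[51] */

/* Original loop: every iteration reads a[i] and a[i+1] from memory. */
void fill_plain(unsigned long long a[LEN])
{
    for (int i = 0; i < LEN - 2; i++)
        a[i + 2] = a[i] + a[i + 1];
}

/* Hand-written equivalent of the commoned loop: the two values needed
   by the next iteration are carried in p0 and p1 instead of reloaded. */
void fill_commoned(unsigned long long a[LEN])
{
    unsigned long long p0 = a[0];
    unsigned long long p1 = a[1];
    for (int i = 0; i < LEN - 2; i++) {
        unsigned long long p2 = p0 + p1;
        a[i + 2] = p2;   /* the store is still needed */
        p0 = p1;
        p1 = p2;
    }
}

/* Returns 1 if both versions produce identical arrays when started
   from x0 and x1, and 0 otherwise. */
int fills_agree(unsigned long long x0, unsigned long long x1)
{
    unsigned long long a[LEN] = { x0, x1 };
    unsigned long long b[LEN] = { x0, x1 };
    fill_plain(a);
    fill_commoned(b);
    for (int i = 0; i < LEN; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Unsigned arithmetic is used so that any wraparound is well defined and identical in both versions.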

It's been a while since I looked at GCC's implementation, but it did pretty well in the past (whether you can do commoning depends on your ability to identify and group sequences, etc.).

-ftree-vectorize does the obvious thing (turns on vectorization). How effective it is depends on a lot of factors.
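As a rough illustration of two of those factors (aliasing and loop-carried dependencies), here is a sketch with one loop that is usually an easy target for a vectorizer and one that is not. Both functions are made up for this example, and whether GCC actually vectorizes either one depends on the version, target, and flags, so treat the comments as expectations rather than guarantees.

```c
#include <stddef.h>

/* y[i] += a * x[i]. Iterations are independent, and `restrict`
   promises the compiler that x and y do not overlap, which removes
   one common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Running sum in place. Each iteration needs the result of the one
   before it, so this loop-carried dependency typically prevents a
   straightforward vectorization. */
void prefix_sum(size_t n, float *y)
{
    for (size_t i = 1; i < n; i++)
        y[i] += y[i - 1];
}
```

Recent GCC versions accept -fopt-info-vec, which reports the loops that were vectorized, so you can check instead of reading the assembly.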



