Tuesday, November 28, 2006
The Problem with Programming
Bjarne Stroustrup, the inventor of the C++ programming language, defends his legacy and examines what's wrong with most software code.
By Jason Pontin
In the 1980s and 90s, Bjarne Stroustrup designed and implemented the C++ programming language, which popularized object-oriented programming and influenced numerous other programming languages, including Java. C++ remains the archetypal "high level" computer language (that is, one that preserves the features of natural, human language), and it is still used by millions of programmers. Many of the systems and applications of the PC and Internet eras were written in C++. For all that, the language remains controversial, largely because it is notoriously difficult to learn and use, and also because Stroustrup's design allows developers to make serious programming mistakes in the interest of preserving their freedom. Stroustrup, for many years a researcher at AT&T Bell Labs, is now a professor of computer science in the Department of Engineering, at Texas A&M University, near Houston.

Technology Review: Why is most software so bad?

Bjarne Stroustrup: Some software is actually pretty good by any standards. Think of the Mars Rovers, Google, and the Human Genome Project. That's quality software! Fifteen years ago, most people, and especially most experts, would have said each of those examples was impossible. Our technological civilization depends on software, so if software had been as bad as its worst reputation, most of us would have been dead by now.

On the other hand, looking at "average" pieces of code can make me cry. The structure is appalling, and the programmers clearly didn't think deeply about correctness, algorithms, data structures, or maintainability. Most people don't actually read code; they just see Internet Explorer "freeze."

I think the real problem is that "we" (that is, we software developers) are in a permanent state of emergency, grasping at straws to get our work done. We perform many minor miracles through trial and error, excessive use of brute force, and lots and lots of testing, but--so often--it's not enough.
Software developers have become adept at the difficult art of building reasonably reliable systems out of unreliable parts. The snag is that often we do not know exactly how we did it: a system just "sort of evolved" into something minimally acceptable. Personally, I prefer to know when a system will work, and why it will.

TR: How can we fix the mess we are in?

BS: In theory, the answer is simple: educate our software developers better, use more-appropriate design methods, and design for flexibility and for the long haul. Reward correct, solid, and safe systems. Punish sloppiness.

In reality, that's impossible. People reward developers who deliver software that is cheap, buggy, and first. That's because people want fancy new gadgets now. They don't want inconvenience, don't want to learn new ways of interacting with their computers, don't want delays in delivery, and don't want to pay extra for quality (unless it's obvious up front--and often not even then). And without real changes in user behavior, software suppliers are unlikely to change.

We can't just stop the world for a decade while we reprogram everything from our coffee machines to our financial systems. On the other hand, just muddling along is expensive, dangerous, and depressing. Significant improvements are needed, and they can only come gradually. They must come on a broad front; no single change is sufficient.

One problem is that "academic smokestacks" get in the way: too many people push some area as a panacea. Better design methods can help, better specification techniques can help, better programming languages can help, better testing technologies can help, better operating systems can help, better middle-ware infrastructures can help, better understanding of application domains can help, better understanding of data structures and algorithms can help--and so on.
For example, type theory, model-based development, and formal methods can undoubtedly provide significant help in some areas, but pushed as the solution to the exclusion of other approaches, each guarantees failure in large-scale projects. People push what they know and what they have seen work; how could they do otherwise? But few have the technical maturity to balance the demands and the resources.

Comments
dbarthel on 11/28/2006 at 11:42 AM
1
bmn on 11/28/2006 at 5:07 PM
25
b_russel on 11/28/2006 at 8:08 PM
2
OsGuy on 11/29/2006 at 1:56 PM
2
brianr987 on 12/05/2006 at 4:08 AM
1
DarkWater on 12/05/2006 at 4:36 PM
1
rjmcguire on 12/08/2006 at 5:03 AM
1
asw_tr on 11/28/2006 at 5:33 PM
2
I find personal attacks, such as "living in denial", less useful than evidence to support your points.
wakko4cpp on 11/29/2006 at 8:21 AM
1
zhaopian on 11/29/2006 at 4:44 PM
1
Selling a finished, quality, reliable product is VASTLY easier than one on paper - and being able to sell it NOW, one can ask for more $$ than the startup who is promising it in 6 months. When you've completed 30yrs in this business, you might appreciate the above comments.
Z.
appleg on 11/30/2006 at 11:07 AM
1
The single biggest cause of programming errors is improper use of pointers and memory mismanagement. The fact that C++ allows programmers (and forces programmers) direct access to addresses and allows pointer arithmetic means that undisciplined coders have more than enough rope to hang themselves. However, driver writers must have this low level of control.
C++ however tries to satisfy device driver writers and at the same time to be an abstract object oriented language for extremely large projects. To mix bit flipping and multiple inheritance puts the burden of developing quality, high-performance, reliable software on the corporate process (The Bell Labs processes were extremely rigid and followed religiously until the ATT/Lucent disaster).
No compiler and no language can compensate for a lack of discipline in a project's software engineering practices.
cmcknight on 12/01/2006 at 5:17 PM
2
Stroustrup is not living in denial, he is merely pointing out that the trend for the last ten years has been to dumb down the languages because most developers aren't capable of handling a language that requires a large amount of discipline. Unfortunately, educators are as much to blame as anyone for this trend because they have taken the easy way out to teach job skills to get a minimum wage programming job right out of school rather than computer science.
Sadly, most people won't take responsibility for their own shortcomings and want to blame things on just about everyone/everything else. Can you write bad code in C++? Sure, no problem. The old joke that C++ enables you to create multiple instances of the gun to concurrently shoot yourself in the foot is really a truism.
The fascination with abstraction for its own sake which Stroustrup refers to in the interview is rampant in too many languages / libraries / frameworks / etc. so that the lowest common denominator can make a huge, bloated, underperforming mess of things. It's frightening to realize that we have more computing power at our fingertips than at any time in the past and yet badly written software (as Stroustrup notes) has virtually eliminated all of those gains.
I've heard developers say "first you make it work, then you make it work fast" as if performance (and security for that matter) are bolt-on items. The fallacy here is that if you don't design a system for performance (and security) from the outset, then you will not get it by hacking on it after the fact. Quoting "get there first with most" is just another cop-out for not having the discipline to do it right the first time because apparently that's too hard and takes away from the developers who fancy themselves misunderstood artists instead of engineers.
And yet there are individuals, like yourself, who don't blink an eye when saying "we just need more/bigger hardware and the performance problem will be eliminated." Now who's living in denial?
pkghosh on 12/04/2006 at 4:37 PM
1
Pranab
BRoyds on 12/05/2006 at 4:37 PM
1
Any necessary I/O or character manipulation abilities are available in function libraries, not as part of the language, and so are not capable of being checked for correctness as part of the language compile.
As well C does not have true arrays (arrays in C are syntactic sugar for pointers), nor proper strings (strings are just a sequence of 8 bit integers finished by the number 0), although it does have syntax for converting a character string to such a sequence.
This design (unbounded arrays, uncounted strings) has been the cause of most security problems of the last 30 years (buffer overflows only happen in C or libraries written in C, for example).
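To make the "unbounded arrays, uncounted strings" point concrete, here is a minimal C++ sketch of the counted and bounds-checked alternatives the standard library offers. The helper names `checked_read_fails` and `length_seen_by_c` are made up for this illustration and do not come from the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reports whether a bounds-checked read at index i is
// rejected. std::vector::at throws std::out_of_range for an invalid index,
// whereas indexing a built-in array out of range is undefined behaviour.
bool checked_read_fails(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Hypothetical helper: the length a C-style routine sees, which stops at the
// first zero byte because the string carries no stored count.
std::size_t length_seen_by_c(const std::string& s) {
    return std::strlen(s.c_str());
}
```

A `std::string` built as `std::string("ab\0cd", 5)` reports a size of 5, while `length_seen_by_c` reports 2 for the same bytes, which is the counted-versus-terminated difference in a nutshell.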
C++, unfortunately, kept the flaws in C as well as its good points. It did add many good features to encapsulate objects, and provide many conceptual tools. But in many places, raw C shows through, muddying what could have been a clean object language.
An example is the syntax for declaring a pure virtual function.
In C++ the syntax is
virtual resultType Foo() = 0;
which confuses variable assignment with function definition (why is an integer 0 also a valid function declaration?).
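For readers who have not met the syntax being criticised, a small sketch may help. The names `Shape`, `Circle` and `describe` are invented for illustration. The `= 0` is a pure-specifier rather than an assignment: it declares that the class supplies no required definition and is therefore abstract.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical interface, used only to illustrate the syntax.
class Shape {
public:
    virtual ~Shape() = default;
    // "= 0" marks name() as pure virtual; nothing is assigned at run time.
    virtual std::string name() const = 0;
};

// A class with a pure virtual function cannot be instantiated directly.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");

// A derived class becomes concrete by overriding every pure virtual function.
class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

// Calls the override through a reference to the abstract base.
std::string describe(const Shape& s) { return s.name(); }
```

Whether the notation is a wart is a matter of taste; the sketch only shows what it means to the compiler.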
It still kept the confusion in C between arrays and pointers, so that one cannot declare an array with more than one dimension without declaring an array of arrays.
int A[3,4];
is an error rather than a guaranteed contiguous piece of memory on which slices and other array functions can work efficiently. Fortran still handles mathematical calculations more efficiently than C.
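As a point of comparison, the array-of-arrays spelling that C++ does accept can be sketched as follows. The helper `sum_grid` is a made-up name. Note that a built-in `int[3][4]` occupies exactly twelve `int`s of storage with no padding between rows, although the language provides no slicing operations on it.

```cpp
#include <cassert>

// An array of 3 rows, each an array of 4 ints, takes exactly 12 ints of
// storage: arrays contain no padding between their elements.
static_assert(sizeof(int[3][4]) == 12 * sizeof(int),
              "rows are laid out back to back");

// Hypothetical helper: sums a 3x4 grid by walking it row by row.
// The parameter is a reference to the whole array, so the dimensions
// are part of the type and no decay to a pointer takes place.
int sum_grid(const int (&grid)[3][4]) {
    int total = 0;
    for (const auto& row : grid) {
        for (int value : row) {
            total += value;
        }
    }
    return total;
}
```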
The main reason C++ is used today is because it is an extension of C. Its problems are also for the same reason.
zhrinze on 12/09/2006 at 8:24 PM
1
If you had done any assembler programming you'd know that many assemblers terminate strings, defined as a sequence of characters, with a null character. Long before there were OS functions that handled output, characters were sent to the output device one at a time in a loop that terminated when the null character was reached.
Your ignorance continues by saying that the libraries cannot be checked during compile. Apparently you know nothing about compiler design, or you'd know that the prototypes are checked as tightly as you direct the compiler to check. The functions in the libraries have already passed a compile and do not get recompiled. If they fail in your program, it's because the maximum syntax and language checking was not done while these libraries were compiled to begin with.
C++ was originally "compiled" to C code and then the C compiler compiled it to object code for the linker.
You may have limited experience with languages, but some of us do not. I've worked on platforms ranging from a Cray-XMP to an IBM 4381 (and later a 3090). I've programmed for four-bit and eight-bit processors including the 6502 in the Apple II, the 8080 in several machines running CP/M, the Motorola 68000 in Macs, and all incarnations of the intel x86 product line. I've done systems programming for the IBM AS/400 and for customized linux installations. I ported compilers to oddball platforms when many did not exist and still don't.
Programmers who were taught to write quality code and who told bosses who wanted quantity over quality to get lost often lost their jobs. The mass market demanded software, and they got what they asked for. In fact, I can't think of too many products on the market that don't fit that paradigm.
Crap will be crap until people make it better.
gandor on 12/12/2006 at 4:51 AM
1
To sell quality you need a market that wants quality.
I think there are markets where quality matters, and where it is demanded, e.g. in consumer electronics.
All the ideas about how to develop high quality depend on someone demanding it. That's it.
TechMonk on 12/06/2006 at 12:29 AM
1
mthomas on 01/15/2007 at 7:21 PM
1
I think that the problem that people generally have with C++ is that they don't truly learn it before using it. Languages such as Java and C# protect a person from the typical mistakes that can be made in writing C++ code (garbage collection). However, that isn't to say that the very same people write good code in Java or C# either.
I enjoy writing code in C++ because of the challenge it gives me. However, languages such as Java and C# are not inferior. Java and C# are from a different era. I don't like all this negativity towards C++. I feel that it's unfair to make a critical account of the differences between C++ and C# because of the difference in maturity of both languages. Could one compare a mobile of ten years ago with one of today and be making a fair comparison?
I conduct many code reviews within my organisation and I'm finding that more and more people are making silly mistakes in C++ because they are also using C#. I also conduct interviews for Senior C++ development roles and find that it's almost impossible to find enthusiastic, experienced senior C++ developers. Interesting...
pcr on 05/30/2007 at 6:37 PM
3
1. The programmers involved in a project had no OOP skills and this is a time bomb for a C++ project. You can write C code without reading tons of books and articles, but an efficient OOP design requires some experience and good OOP skills.
2. The project was not so important in terms of reusability, maintenance, testing or so. If you just want to code and deliver to the client, then the lifecycle of C++ programming seems to be longer than C. But if you write code for a huge project which spreads over multiple years, with multiple changes in requirements and requests, you can better see the advantages of OOP programming.
C++ has clear problems which Java and C# solve, or, more exactly, their VMs. But C++ remains a great OOP language for those who want to use it properly.
bobhargraves on 11/28/2006 at 12:39 PM
7
wkaras on 11/28/2006 at 6:11 PM
8
If there is no need for a language with the "dangerous" capabilities that C++ has, what language do you suggest the byte-code interpreter be written in?
A relevant analogy might be the way, in the Middle Ages, Latin was used for certain highly specialized topics (law, medicine, science), and the local language for general purposes. This approach was abandoned in favor of using a single language with special terminology for certain topics.
Instead of having multiple programming languages, wouldn't it be better to have tools that help make sure dangerous features are used properly? Such tools could restrict the use of built-in arrays to certain files, or flag all usage of "new" for careful scrutiny during code review. This would be analogous to the editor who checks to make sure that medical or legal jargon is not being used in a general-circulation newspaper.
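A toy version of such a tool could look like the sketch below. It is deliberately naive: the function name `lines_using_new` is invented here, and the scan does not skip comments or string literals, so it only illustrates the idea of flagging `new` for review rather than describing any real checker.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based numbers of the lines in `source`
// on which `new` appears as a whole word. Identifiers such as "renewed" are
// not matched because \b requires a word boundary on both sides.
std::vector<int> lines_using_new(const std::string& source) {
    static const std::regex new_keyword(R"(\bnew\b)");
    std::vector<int> hits;
    std::istringstream in(source);
    std::string line;
    for (int number = 1; std::getline(in, line); ++number) {
        if (std::regex_search(line, new_keyword)) {
            hits.push_back(number);
        }
    }
    return hits;
}
```

A reviewer could run something like this over the files that are not supposed to manage memory directly and look only at the reported lines.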
b_russel on 11/28/2006 at 8:04 PM
2
To be fair, you'll have to admit that languages like Pascal, OCaml or Ada have been around for years, and they all have either bytecode interpreters or native compilers written in Pascal, OCaml or Ada.
wkaras on 11/29/2006 at 11:05 AM
8
dolio on 11/30/2006 at 12:42 AM
1
C certainly does have its place, and perhaps C++ does, too. But people use both far outside of the domains for which they work the best. If you're at all interested in safety, abstraction, and preventing stupid errors---and such things are probably more interesting than absolutely maximizing performance for all but a few things---then languages like Haskell, OCaml, Ada, etc. are easily the right choice over C.
No amount of code scanning is going to make it as easy to write correct programs in C as it is in those languages, and the performance advantages of C are shrinking as well (hell, they may disappear when we end up with lots of cores, which can be used easily via data parallelism in higher level languages, but require lots of confusing threading to use in C).
wkaras on 11/30/2006 at 12:08 PM
8
Generally speaking, not doing something is easy, regardless of the language you are not doing it in. It seems likely that using a single language with support tools to control if/when certain features are used is more simple/flexible/conducive to economies of scale than using multiple languages with multiple "safety" levels.
yacc on 12/05/2006 at 3:24 AM
1
The problem is that C++ is an unnice language, and while I'm a college dropout (who happens to know C++ anyway), there is a disturbing aspect about the people I'd accept on a C++ project. Hmm. 2 guys I know who could discuss the finer points of generic programming in C++. And one guy with whom I debugged memory errors in a huge C++ app.
Guess what these guys all have in common. Let's start, guess a Ph.D. in CS and a good number of years of work experience?
The problem basically is, that the lowlevel stuff is well enough covered by other languages, e.g. C.
And all the high-level stuff is scary enough without being added-on to an "unsafe" base language.
Now my current project has reminded me, that developers can be the "enduser" too (if one does a language or framework), and one needs to consider their requirements.
Guess what, no matter how much we want to whine about these facts, most "commercial" developers don't have enough capability to think in really abstract terms. (Although I guess the number of guys knowing C++ well enough to work safely with it is still higher than the count of guys knowing predicate logic :) )
The point is, that C++ is a huge language, containing many warts in the name of performance.
Now many propose that one can program effectively in C++ without knowing the whole language in detail. That's not so. First you have libraries/old code. You need to know everything that's in there. In theory you don't need to know that stuff by heart. In practice you do, because you need to understand it when troubles happen. Be it a cryptic error from the compiler (about not finding the right method for these types), or even worse be it an obscure memory handling bug.
So basically to repeat the problem: C++ misses completely on the requirements for 99% of commercial software development.
And btw, software maintainability is usually the most important criterion, performance is way down the line and rises in the stack only when the stuff is not fast enough.
Now to the details, what languages should operating systems and VMs be written in? Well, C/C++/Modula3 come to mind. And even here the complexity of C++ is usually not worth the hassle.
E.g. see Modula3 for a high-level language that supports all the lowlevel manipulations needed for operating system stuff in a sane way.
Why are VMs written in C++? Who says so? Smalltalk usually has its VM written in a Smalltalk subset. Python has its own written in C, Lua is ANSI C too. The only one I'm not sure about at the moment is Java. Actually, all the complexity of C++ makes it a bad glue language too, so the lowlevel stuff is often written in C.
(Btw, for Python, there is a project underway to express the VM in a Python subset that gets automatically compiled to C)
And well the performance myth. The truth is, that in most practical cases, using a more highlevel language, perhaps with some lowlevel helper modules (or not), is fast enough. Actually it's usually faster, at least when it comes to developers' time. And with all these man months free, these developers have the time to think about better algorithms, micro optimizing and the general design of your software.
(Again my personal experience shows in at least two projects that I can beat C++ performance, even for atypical tasks like parsing and interpreting huge binary data recordings in Python. And yes, in this case I did it in one third of the time the C++ guys used up before giving up on the complexity of the files involved)
So the niche for C++, where I would recommend it for a new project is very small. On the lowlevel side, C is often a better solution, as it has no surprise-surprise effect for developers. And on the high-end side, Java and dynamic languages like Python, Ruby and Perl are usually better suited.
And even for that small niche, where you need big systems with many lowlevel parts, there are other alternatives (e.g. Modula3, Java-to-platform-binary compilers).
One last thought before I close. One has to consider C++ as it is, and not as what it could be. E.g. that means usually one string class for every second library you use. Combined with the fact that C++ normally does not include garbage collection, which forces library developers to include informal memory ownership into their APIs, this leads often to a situation that a cool C++ program has to copy strings multiple times where a Python or Java program passes just a pointer around.
bunbun on 12/05/2006 at 7:23 AM
1
When this is not appreciated, programmers in c++ suffer especially. This is partly because there are some "C" relics in the language which should never be touched in normal use, and partly because the flexibility of the language sometimes encourages an "anything goes" mentality among programmers.
Idiomatic c++ should be clear and safe, corresponding almost line for line to "high level" scripting languages like python but with dramatically greater possible performance.
There should be a preference for use of the standard library. No arrays but vectors. No char* but std::string.
There should be no explicit memory handling. Instead all programme resources (including files, memory, OS handles, painting DCs) should have clear lifetimes either by being created within local scope (on the stack) or as members of objects which themselves should have clear lifetimes. They can be managed using smart pointers (http://www.boost.org/libs/smart_ptr/smart_ptr.htm) which also cover rare cases (shared ownership, cyclical ownership etc.)
Memory handling on legacy programmes created without a clean design can be patched by dropping in Hans Boehm's free garbage collection library.
There should be no typecasting.
Naked pointers should not be used.
There are free and high quality peer reviewed libraries for regular expression matching, graphs, string handling, parsing, programme options etc. from boost.org which every c++ programmer should be familiar with.
These rules are meant to be broken but whenever they are, c++ programmers should understand that "Here be dragons" is writ large all over and should manage the risks and tradeoffs accordingly.
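The rules above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses `std::unique_ptr` and `std::make_unique` from the later C++ standard library (C++14) rather than the Boost smart pointers linked in the comment, and the names `Connection`, `open_connection` and `join` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource type standing in for a file, socket or OS handle.
struct Connection {
    explicit Connection(std::string h) : host(std::move(h)) {}
    std::string host;
};

// Ownership is stated in the return type. The Connection is destroyed
// automatically when the unique_ptr goes out of scope, so no delete and
// no naked owning pointer appears anywhere.
std::unique_ptr<Connection> open_connection(const std::string& host) {
    return std::make_unique<Connection>(host);
}

// std::vector and std::string replace built-in arrays and char*; both know
// their own length and release their memory themselves.
std::string join(const std::vector<std::string>& parts,
                 const std::string& separator) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) {
            out += separator;
        }
        out += parts[i];
    }
    return out;
}
```

Nothing in the sketch casts, calls `new` or `delete`, or indexes a raw array, which is roughly the style the comment is arguing for.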
This is very different from what c++ programming was like when I started in the 1990s and many self-styled "gurus" and programmers haven't caught up with this new safer, simpler, library-driven approach yet.
The power of c++ should be used to write libraries with clear abstraction, great flexibility and extreme performance (again see the examples at boost). The user of those libraries should not have to worry about these issues any more than a driver of a car has to worry about fuel injection technology.
There are still many deficiencies within c++: the lack of a really well designed threading library, error messages that can be truly awful even for simple errors, the lack of clean syntax for iterating over a range. These and other problems look like they will be taken care of at the next round of standardisation being completed so things are on the up.
Modern c++ is getting more powerful so that ordinary users can write simpler cleaner code.
If your own c++ code is in a mess, maybe you are being too clever for your own good?
Leo
bs on 12/05/2006 at 8:29 PM
6
Obviously, C++ isn't perfect. Neither are any of the other languages, of course. C++0x should deal with some of the very reasonable points you mention towards the end: there will (finally) be a standard threads library (hopefully well designed), a garbage collector (which can be optionally enabled) will be part of every implementation (saving people the bother of downloading one), and the spectacularly bad error messages that you can get from using templates will be eliminated through the use of "concepts". For example, see http://www.artima.com/cppsource/cpp0x.html or search for C++0x on the web.
kellin on 12/05/2006 at 11:24 AM
1
Your comment is very fallacious. Being able to discuss the finer points of C++ has nothing to do with how many years of experience you have or if you have a Ph.D. It doesn't even have anything to do with whether you're a bookworm or a nerd or a geek. It has everything to do with whether you learn, or you want a quick fix.
C++ is not a quick fix language - you can't learn it for a specific purpose and expect to understand all its idiosyncrasies. You can learn the aspect you wanted to learn and that's about it. Unless you want to learn the language in general in which case it's really easy to assimilate. Granted, the ARM is a little longer than the C book, but only because it's annotated. But then again, if you understand what the heck you're reading you should be able to quote parts of the ARM. It's VERY straightforward.
Most of the other languages you see are more forgiving. You can learn one aspect of the language and it easily segues into other parts, and soon enough, by doing enough, you eventually know the rest. C++ is not designed this way. It's partitioned for specific uses. You can only know what you seek to learn.
Oh, by the way. In school I was part of pretty much every club, was one of the star athletes (soccer, volleyball, tennis, track), and was the most popular boy in school. So much for your Ph.D. theory.
One example, does not a generalization make. But it does realize one.
Cheers
gabrielg01 on 11/28/2006 at 9:17 PM
294