Compy Ed: Blogging about Computer Science Education

Sunday, February 11, 2007

Developing Software

Filed under: computer science, education, politics — compyed @ 6:37 pm

Most intro programming courses are about programming and not much about developing software.  Thus, the time is often spent covering the syntax of the language (and sometimes not even that, as teachers are often happy to say students can look up the syntax, which is strange, since English teachers never say that) and solving simple problems.

To be fair, we all have to start somewhere when we learn how to program, but many teachers of programming are only familiar with programming in the small, how to solve tiny problems.

I once read Donald Knuth talking about computer science.  To him, computer science was not about programming, per se, it was about thinking algorithmically.  But he’s wrong, or at least, partly so.  We can study some algorithms in isolation from everything else, but really, people usually program much larger problems.  We’re not interested in quicksort by itself.  We’re interested in solving a problem which has a tiny component requiring sorting.

The hardest part of developing software, other than solving the problem at hand, is deciding what should be solved and evaluating whether the solution, given time and other constraints, makes sense or not.

In principle, this is what software engineering purports to teach.  While I realize that teaching software engineering to students who don’t know how to program is analogous to talking about plot and character development before someone understands the grammar of a language, it’s still important enough to consider.

Here’s an exercise for you to try.  Ask yourself what you think a good software engineer needs to learn.  Then, pretend you are chair of a department with a good deal of power.  Decide which courses students should take, and which courses will cover the content needed for them to succeed.  Compare that with your current curriculum and see how well it aligns.

Software development is a work in progress, and as such, teachers of programming need to understand where the state of the industry is, and decide whether what we teach can be better aligned to what the industry needs.

Saturday, February 10, 2007

What Do Computer Science Majors Need?

Filed under: computer science, education — compyed @ 9:06 pm

Computer science is a relatively young major. It’s mostly been in existence since 1980, maybe 10 years more in a few universities. How does one go about creating a new major? Presumably, you follow what’s worked before. And yet that may make little sense for computer science.

Since computer science uses the word “science”, it makes some sense to follow what’s been done in other sciences. In particular, many sciences have labs. The reason is simple. How many people have chemicals in their dorms? Or frogs? Or a voltmeter? This equipment is not only expensive, but some of it is downright dangerous. It makes sense to keep this equipment in a lab, and let students use it with supervision.

This is the model some universities took to teach computer science. At one point, this made some sense too. Computers were once very expensive, costing 2000 dollars or more. The earliest Macs were nearly 2500 dollars. And this was in 1984! This was a significant investment by parents, even if they were savvy enough to realize that having a computer might help students move ahead. Some PCs cost 3000 dollars or more.

Universities bore the brunt of this cost for many years, and it didn’t become really affordable until about 1995, when thousand dollar computers became more commonplace. While a thousand dollars is still pricey, realize 15 years had passed, and people were that much more wealthy, and so this represented a smaller chunk of change than it did in 1980.

Still, universities often lag, not looking at the world outside their ivory towers, and keeping to the same standards that they have kept for years and years and years. Do labs even make sense? Many universities now assign projects outside of class, expecting students to devote 10-20 hours (or more!) a week to their programming assignments. These are times far in excess of a typical 3 hour lab, but the experience gained from this is invaluable.

I have no numbers, but I believe the initial steps of learning how to program have a steep learning curve. It requires, even among the best programmers, some time to make mistakes, work through debugging, and so forth, to get proficient. Much like riding a bike, you don’t usually get it right away.

To get to the point, I want to say that the lab was created at some universities for computer science majors because they used a tried-and-true principle. Copy what seems to work. Labs seemed to make sense in the early days of computer science, but make a lot less sense now, because most students own their own computers. It’s sad that such an investment needs to be made to be successful as a computer science major, but these days, that’s what it takes.

But, there’s another more profound issue. How do we teach computer science majors how to program? Many computer science professors were not truly programmers. They were mathematicians. They believe the pinnacle of computer science is solving hard problems, typically involving math. Anything that appears to require brute force coding is seen as mere engineering, meant for grunt workers.

Of course, most of this was a defensive reaction, in an effort to avoid learning how to program. The problem is that these people are also the same folks that are teaching our kids to program. Perhaps no other major has those who teach so unwilling to learn what they are teaching. Fortunately, this is changing. As new professors come in, they have been trained to deal with the frustrations of programming. And yes, programming is frustrating.

Now, teachers are supposed to be teaching, thus sharing some of the students’ pains. They decide that some aspects of programming are indeed far too painful, far too menial, so they use their superior skills at programming, and hide details from the students. Students therefore don’t install compilers, they don’t deal with bad documentation, they don’t have to upgrade their software, and so forth.

All of this is perfectly fair. If you introduce all these challenges too early on, students will lack the skills to succeed, when they think someone else, someone smarter, will take care of these difficulties for them. Computer science is strange that way. The average computer science major must pick up a lot of stuff about computers on their own, everything from what is a USB port, what is Ethernet, what is HTML, what is CSS, to some basics of installing software. These are things that are not taught in computer science because they are deemed to be unimportant, and more critically, to be transient. The difficulties of today may not help you solve the difficulties of tomorrow. When HDMI replaces component video, when USB replaces RS-232 or serial cables, the older technology simply becomes obsolete mere years after it was first introduced. Can you even get a laptop that will let you hook up to your phone line anymore?

But all of this is hidden from students. Why? For one, maybe the people teaching it lack this knowledge. For another, it helps the weaker students get by. Yet, is it enough for them to make a living? When you teach, ask yourself: what does a programmer need to be successful?

In the end, the answer is not a specific skill like programming in Java, but the willingness and ability to learn despite poor resources to do so.

These days, I can point to the following skills that students should have, but probably don’t.

  • Debugging using a debugger (and how to do so)
  • Version control, and how it works
  • Related to that, branching/merging
  • How to profile both time and memory
  • How to learn to use a new piece of software or a library
  • How to design a solution and weigh pros and cons of that solution
  • How to use a bug tracking system. How to report a bug.
  • How to test.

Most of this is rarely taught. If a student is lucky, it’s covered in a decent software engineering program.

Ask yourself, what does a computer science major need? The list may be longer than you think.

Monday, January 15, 2007

New Programming Language?

Filed under: computer science, education — compyed @ 3:05 am

Now that I’ve been working in the industry for a while, there are a few things on my wish list for programming languages.  These aren’t terribly obvious things.

I believe academic programming languages need built in profiling, both memory and runtime performance.  I know, I know.  If students can’t get their heads around basic programming, then why bother with these topics?

Still, in the real world, you need these features, and yet, they are not designed in from the beginning.  If they were, I think you’d find out a lot of cool things about your program, in very practical ways.  I know languages like Java like to hide how much memory is used, but to be fair, so does C++.  Indeed, finding how much memory objects use is hardly a straightforward task.  If such monitoring were easily available, say, in a debug mode, you’d be able to point this out to students.

I also believe that languages need to separate out printing and debugging messages.  Java does this, somewhat, by letting System.out.println be for standard printing and System.err.println be for error printing.  Of course, most people say you should use logging, which, for my money, still needs to be greatly simplified, at least, when you look at the log4j model.

I suppose, if polled, you’d find most people wanting to teach some functional language or some such, to allow for greater semantics, but always near and dear to a pragmatic programmer’s heart is how fast things run, and how much memory it consumes.

Friday, January 5, 2007

Learning How To Use an Editor

Filed under: computer science, education — compyed @ 11:24 pm

I’m an emacs guy. I’ve been using emacs now for, I dunno, maybe 15 years. I can’t say I know everything about emacs. I don’t usually use grep within emacs, nor write e-lisp, nor debug in emacs.

Because I learned emacs, I never really learned vi. I know enough to quit out of vi, and that’s it. I knew a friend who started in vi, and was quite proficient. He forced himself to learn emacs, though, and while he liked emacs and could appreciate its power, there were things he could still do faster in vi than emacs. Each editor had its strength.

vi, in case you don’t know, is the default UNIX editor. In the past few years, it’s been replaced by the more powerful vim.

vi/vim are modal editors, meaning sometimes you are entering commands, and sometimes you are entering text. emacs, on the other hand, is non-modal. It uses control characters to move around the text, where vi uses the letters h, j, k, l (when it is not in text mode) instead of cursor arrows.

Now, this takes the new user a long time to learn, and initially, they can’t stand it. It’s too much work. But if you learn patience, you can eventually learn it. For example, I wanted to learn (once upon a time), how to use Windows shortcuts. I didn’t know how to cut and paste in Windows, thinking it should use the emacs model.

But I forced myself to learn, to work hard at it, so it became second nature.

It’s this kind of unintuitive learning that we do all the time in the computer field. It’s like a Westerner learning to use chopsticks. It requires a great deal of patience, but eventually, you get adept at it.

There are really two aspects of learning an editor. First, especially for editors with steep learning curves like vi/vim, you need to get to the point where you don’t think about the editing actions explicitly. That itself can be hard. Second, you need to learn the various features. Many students live with a tiny subset of what is actually possible. If they would devote a bit more time, they would become proficient at it.

I suggest the way to teach this is through some kind of “game”. Perhaps you give various levels to proficiency, like they do in martial arts. Of course, this may only appeal to those that like competition, but still, it’s more focused than forcing people to learn for its own sake (even if a game is more or less just that, except it appeals to people’s competitive drive).

Although I’m talking about an editor, I could be talking about any program that students use when coding.  It takes time to learn and become proficient, and even more to become an expert.  So many choose not to become an expert because of how much work they perceive it takes.

Once you know you’re willing to do this work, you’ll find it easier to learn.

Sunday, November 19, 2006

On Education Blogs

Filed under: computer science, education, politics — compyed @ 2:39 pm

I just caught something in the Washington Post.  Jay Mathews, a writer for the Post, has a column called Class Struggle, which delves into education issues.  Recently, he asked what the best education blogs are.  Now, I can’t say I read Mathews’s column with any sort of regularity.  Indeed, I only recall reading it once or twice.

But I would have to imagine that education blogs are about the education business, with topics like vouchers, educating in poor neighborhoods, and so forth, being common topics.  The actual act of educating is not discussed much, which is rather amazing.  The closest analogy I can think of is a newsgroup, which I recall was devoted to fitness and bodybuilding.  However, most of the discussion centered on performance enhancing drugs, which, while relevant, seemed to miss the central point.

Similarly, pedagogy seems to be a topic that’s rarely discussed.  Articles dance around the act of educating, and focus on the infrastructure of education.

I wonder how hard it would be to simply blog about education and educating.

A Matter of Taste

Filed under: computer science, education — compyed @ 1:01 am

Hi folks. Sorry I haven’t written in a while.

Many books teach programming by teaching syntax and control flow. After all, programmers do need to learn this.

However, it’s not everything. As time passes, I think we’ll learn to see programming like writing, even if some of us would like to believe that programming is like mathematics, just cranking away at a problem until we have an answer.

Programming is far more creative than that. I’d liken it to architecture. A reasonable problem for an architect is to design a building with some constraints. Beyond that, there’s a great deal of room for creativity.

But really, I should liken it to writing a story. A good story makes decisions about character, plot, but even down to word choice, allusion, and so forth. Many decisions are made to keep the reader engaged.

The criteria in programming are different. There, maintainability, extensibility, ease of testing, ease of reading, are all factors that affect coding. The folks who are into refactoring say that code has bad smells. This community is perhaps the first to make it a point to explain that coding is more than simply getting something to work, but getting it to work right.

I was reading a book about the history of astronomy. In particular, it was once believed that the Earth was the center of the universe, and everything revolved around it. This made us feel important. Yet, there were planets, in particular Mars, that would appear to move backwards then forwards. Some astronomers, unwilling to give up that planets had circular orbits, imagined these planets travelled in circles upon circles. It was complex, but it allowed circles, considered perfect, to still exist.

While others began to hypothesize that the sun was at the center, then to speculate that the planets traversed in ellipses, not circles, some clung to elaborately complex solutions.

Both notions exist in programming. On the one hand, people have elegant and ornate code that is perhaps too complex for the problem at hand, while others are far simpler, far easier to follow.

Both camps in astronomy had a view of the universe. One sought perfection in circles and an Earth-centered universe, even if it meant making circles upon circles, which were hideously complex, while the other sought the simplest explanation that fit the facts.

Programmers sometimes lack these worldviews. They’re concerned with getting stuff to work, and don’t think about how good or bad their code is.

As teachers of programming, we need to discuss quality of code more. I read code and sometimes feel that it’s not the right way to code it up, even if it behaves correctly. I don’t mean silly stuff like whether we should use camel case or not, or whether we should use parallel braces or not. I’m talking about how to pick public methods, and whether to use interfaces or not.

For example, because of Java, I’ve become a huge fan of interfaces. I use them everywhere, even in C++, because I think they buy you flexibility. Over time, I’ve discovered they also buy you decoupling of code. Code now depends on interface, rather than the implementation. This comes at a cost, though I believe it acceptable. The cost is the method must be looked up at runtime. This is called dynamic dispatch (although, frankly, I’d rather call it runtime method lookup). This is slower than knowing the method call at compile time.

In Java, this is less of an issue, because ordinary instance methods are looked up at runtime by default anyway. In C++, you can avoid the lookup if the method is not virtual.
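To make the tradeoff concrete, here is a minimal C++ sketch.  All of the names (Shape, Circle, Square, describe, Base, Derived, call_who) are made up for illustration.  The first half shows code depending on an interface, written as a pure abstract class; the second half shows what happens when a method is not virtual.

```cpp
#include <string>

// A pure abstract class plays the role of a Java interface.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::string name() const = 0;  // resolved at runtime
};

class Circle : public Shape {
public:
    std::string name() const override { return "circle"; }
};

class Square : public Shape {
public:
    std::string name() const override { return "square"; }
};

// Depends only on the interface, not on any implementation.
std::string describe(const Shape& s) {
    return "a " + s.name();
}

// Non-virtual methods are chosen at compile time from the static type.
class Base {
public:
    std::string who() const { return "base"; }
};

class Derived : public Base {
public:
    std::string who() const { return "derived"; }  // hides, does not override
};

std::string call_who(const Base& b) {
    return b.who();  // always Base::who, whatever object b really refers to
}
```

Here describe(Circle{}) gives "a circle" and describe(Square{}) gives "a square" without describe knowing about either class, which is the decoupling.  By contrast, call_who(Derived{}) gives "base": no runtime lookup happens, so the call is cheaper, but the flexibility is gone.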

In any case, as one develops as a programmer, one builds taste in what is good and what is not, and this is something we need to start worrying about. But it can be quite difficult, because unlike writers and English majors, we’re not trained to evaluate code, whereas English classes always discuss writing. Even if it’s subjective, it’s not completely random. You build up experience deciding what is good and bad, and while there is no consensus, people can often judge one writing relative to another.

In programming, two criteria exist. One is speed. This used to be of overriding concern for those who loved C. All sorts of tricks were done to improve speed, at the expense of pretty much everything else. The other is maintainability, extensibility, readability. I tend to fall in the second camp.

This issue is more complex than I have time to explain because I should show a few examples. But hopefully, it’ll give you something to think about.

Saturday, September 30, 2006

Who To Teach To?

Filed under: computer science, education — compyed @ 3:34 pm

As a teacher, you have to decide how challenging a course ought to be.  I recall, once, when a non-faculty teacher was teaching a senior data structures course.  Notice I said senior.  Most universities teach data structures to first or second year students, not fourth year students.  Indeed, if you pick up a data structures book meant for college use, you see data structures such as stacks, queues, heaps, binary trees, and perhaps one or two complex balanced tree structures (red-black trees, AVL trees).

This teacher thought he was teaching at this level, so he assigned a binary search tree as a project.  Part of the problem was that he didn’t seem to consult anyone about what level to teach this class.  As traditionally taught, this senior level data structures course covered advanced data structures such as B-trees, but also spatial data structures (meant to store coordinates, lines, etc. in 2D and 3D).  There are other complex structures, such as Fibonacci heaps, and even data structures so complex that algorithms researchers generally talk about the data structure, rather than implement it.

There’s some level to go up, if you want to get really complex.

The teacher’s mistake, one of not consulting those who taught the course, was a problem. It’s certainly possible, given the unique nature of this course, that he didn’t know about these other data structures, and either was embarrassed over the fact, or, even as a Ph.D., was unwilling to learn the nuances of the data structures and implement them.  After all, it’s so much easier to teach something you already know, even if it’s been a while since you’ve seen it, than to teach something you have to learn.

More than likely, if you’re teaching for the first time, you’ve been given a book and a syllabus from the course taught before.  You can look at exams to get a sense of the level of difficulty, then decide how difficult the course should be.

Difficulty can be measured in different ways.  For example, you can make the course “easy”, at least conceptually, but be a hard grader.  You can make the course intellectually challenging, but make the grading easy (or hard).  You can make the programming aspects lengthy, requiring thousands of lines of code, or shorter, but requiring some advanced knowledge.

All of this can be put under the umbrella of “who should you teach to”?  The best students in the class?  The average student?  (Rarely, does a teacher say they will teach so even the worst students will understand).  Perhaps take a two-tiered approach?  That is, teach some things everyone should learn, but a few things only the best should learn?

Some people prefer to teach to the very smart.  It has one huge benefit.  It takes less work.   One could argue that it’s the mark of a good teacher to make a poor student learn.  However, many teachers don’t aspire to be that good.  It takes too much work.  Imagine you’re presenting the best explanation you can possibly think of.  And still someone says “I don’t get it”.   That has to be deflating, as you feel you’ve done your best, and still, it’s not simple enough.

Teaching to the smart has another benefit.  They’re the ones most likely to use that information well.  Even so, it can be hard to teach well, even to smart people.  You can cover a complex topic in some abstruse way.  Sure, the best students will work hard to learn it, but it doesn’t mean you’ve done a good job.

In the US, you can get various kitchen tools, such as can openers, peelers, ice cream scoops, and so forth, by a company named Oxo.  Oxo makes grips that are largish, round, and rubbery.  These tools were originally aimed at those with arthritis who wanted bigger handles.  However, even those who were perfectly fine liked these oversized handles.  Without the non-elderly, non-arthritic majority, Oxo would not have been as successful as they have been.

Similarly, if you teach well even to the average student, it can benefit the best student too.

My suggestion is to teach both to the average and to the best student.  Doing so may be difficult.  Some people have no idea how to teach to a really good student, not being really an expert.  Some have no idea what average means.

Indeed, that’s my last point to make for this entry.  What is average and excellent depends very much on where you are.   Teaching to the average at one university may be just fine, yet too advanced for another, and far too simple for another.  You have to learn to adjust your teaching to your audience, which means you need to know your audience.

Thus, I leave you with this thought.  Figure out your audience.  Figure out who you want to teach to.

What is Programming?

Filed under: computer science, education — compyed @ 2:08 pm

Once upon a time, nearly twenty years ago, Pascal was the most commonly taught language in high school and colleges. It replaced older languages like Fortran and more obscure languages.

I haven’t programmed in Pascal in quite a while, so you’ll have to forgive me if my recollection is a bit rusty. Pascal is a procedural language. There was no notion of objects with Pascal. In that sense, it resembled C.

Pascal had a few features that C didn’t have and vice versa. For example, in Pascal, there are two ways to pass a value to a function. Either you pass by value (a copy of the value is made) or you pass by reference (essentially, a pointer to the value is sent). Passing by reference in Pascal is similar to C++.
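A small C++ sketch shows the difference.  The function names are hypothetical; the reference parameter (int&) corresponds to what Pascal calls a var parameter.

```cpp
// Pass by value: the function works on its own copy of the argument.
void increment_copy(int n) {
    n = n + 1;  // changes only the local copy
}

// Pass by reference: the function works on the caller's variable,
// much like a Pascal `var` parameter.
void increment_in_place(int& n) {
    n = n + 1;  // changes the caller's variable
}

// Small helpers that show the effect on a caller's variable.
int after_increment_copy() {
    int x = 5;
    increment_copy(x);
    return x;  // still 5
}

int after_increment_in_place() {
    int x = 5;
    increment_in_place(x);
    return x;  // now 6
}
```

The call sites look identical in both cases, which is exactly why this feature trips up beginners: you have to look at the function’s declaration to know whether your variable can change.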

C exposes pointers everywhere. You can take the address of practically anything in C. In Pascal, you can only use “new” (or whatever the equivalent was in Pascal) to create dynamically allocated structures. You couldn’t take the address of anything.

Pascal allowed for functions nested inside other functions. If function Inside was nested inside of function Outside, then Inside has access to Outside’s parameters. Only Outside could make calls to the nested functions. They were not accessible by functions outside of Outside.
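C++ has no nested functions as such, but a local lambda gives a rough approximation of the idea.  This is only a sketch, and the names outside, inside, values, and threshold are made up for illustration.

```cpp
#include <vector>

// `outside` plays the role of the enclosing Pascal function.
int outside(const std::vector<int>& values, int threshold) {
    // `inside` is local to `outside`. It can see outside's parameters
    // (captured by reference here), and no other function can name it.
    auto inside = [&](int v) { return v > threshold; };

    int count = 0;
    for (int v : values) {
        if (inside(v)) {
            ++count;
        }
    }
    return count;  // how many values exceed threshold
}
```

For example, calling outside with the values 1, 5, 10, 20 and a threshold of 4 counts three values.  Note that inside never takes threshold as an argument; it simply uses the one belonging to outside, which is the access the paragraph above describes.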

Pascal makes a distinction between a procedure and a function. In C, a procedure would be called a void function, i.e., a function that does not return a value. A function returns a value.

In C, for a function to return a value, you use the return statement, and return the value back with it. In Pascal, there’s a special variable named after the function, having the return type of the function. You assign the return value to this special variable. Unlike C, where the function is finished once the return statement is run, Pascal runs the code until the end of the function block, even if you have already assigned the function variable.

To illustrate, suppose you have a function foo. This function is supposed to return an integer value. At some point, you’d be expected to assign the variable foo inside the function to an integer. In principle, you could assign it several times. Whatever value the variable has when the function block is complete, that’s the return value of the function.
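Since my Pascal is rusty, here is the same idea imitated in C++ rather than real Pascal.  The local variable result stands in for Pascal’s special variable named after the function, and both function names are hypothetical.  One liberty taken: result is initialized to 0 so the C++ is well defined, which Pascal would not do for you.

```cpp
// Pascal-style: assign to a result variable and keep running
// until the end of the function block.
int foo_pascal_style(int n) {
    int result = 0;       // stand-in for the variable named after the function
    if (n < 0) {
        result = -1;      // first assignment; execution continues
    }
    if (n % 2 == 0) {
        result = 2;       // may overwrite the earlier assignment
    }
    return result;        // whatever it holds at the end of the block
}

// C-style: the first return statement reached ends the function.
int foo_c_style(int n) {
    if (n < 0) {
        return -1;        // nothing below runs for negative n
    }
    if (n % 2 == 0) {
        return 2;
    }
    return 0;
}
```

The two versions agree on most inputs but not on a negative even number such as -4: the Pascal-style version ends up with 2 because the second assignment overwrites the first, while the C-style version has already returned -1.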

Pascal was designed by Niklaus Wirth to be a teaching language, not meant to be used in the “real world”.  Perhaps its chief deficiency is the lack of separate compilation (though later versions may have fixed that).  This meant everything had to be in one file.  I suppose it may have had a limited library (thus stabilizing the language).

In fact, the key feature of a teaching language is one that doesn’t change because industry demands this feature or that.

But let me not delve too deeply into Pascal.  What I want to get at is what we think programming is.

When I was teaching programming, we spent a good deal of time talking about control flow, arrays, variables, functions, and pointers (we taught C).  Notice the one topic I didn’t cover deeply, and the one that might be of greatest controversy: algorithms.

Why didn’t we cover (much) algorithms?  I was teaching at a public university in a major that was non-selective.  Anyone could be a computer science major.  And, unlike engineers, these majors require far less math to graduate.  We required two semesters of calculus (or was it three?), a semester of stats, and a semester of linear algebra.  Even so, all you need is a grade of C to pass, and that isn’t a particularly high standard (to be fair, there are excellent programmers that are awful at math).

Algorithms require some mathematical maturity to properly understand them.   Many Ph.Ds have been so good at math for so long that they can sometimes scarcely believe that some folks are bad at it, or that those that are bad at it would even contemplate majoring in computer science.   Yet, many students can become good programmers without math (I personally believe you can’t be great without some knowledge of math, however).

Some intro teachers focus on problem solving and algorithms rather than features of the language, believing that this transcends specific details of programming languages.  In other words, you simply have to think algorithmically, not in any particular language (though resembling Pascal, perhaps).  The niggling details of this language or that are merely a distraction, preventing you from solving a problem.

This is why many a professor has stopped programming for real.   Those details are a pain.  And why bother committing them to memory?  The proof of the undecidability of the halting problem is so much more elegant than the mundane overloading of the term static in C/C++, which can mean “having file scope”, “having static storage”, or “being a class variable”, depending on where the keyword is applied, or remembering that a compare function doesn’t always return -1, 0, or 1 (instead returning a negative value, zero, or a positive value).  After all, there are no syntax errors in proofs.  You can write in stylized Math English, and have a great deal of latitude in the proof.
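For the curious, here is a short C++ sketch of those mundane details.  Every name in it (file_local_counter, bump_file_local, next_id, Widget, sign_of_compare) is made up for illustration; the only library call is std::strcmp, which promises only the sign of its result.

```cpp
#include <cstring>

// 1. static at file level: the name is visible only inside this file.
static int file_local_counter = 0;

int bump_file_local() {
    return ++file_local_counter;
}

// 2. static inside a function: the variable has static storage,
//    so it keeps its value from one call to the next.
int next_id() {
    static int id = 0;
    ++id;
    return id;
}

// 3. static inside a class: one variable shared by every instance.
class Widget {
public:
    Widget() { ++live_count; }
    static int live_count;
};
int Widget::live_count = 0;

// strcmp only guarantees negative, zero, or positive, so code that
// wants exactly -1, 0, or 1 has to normalize the result itself.
int sign_of_compare(const char* a, const char* b) {
    int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

Calling next_id() twice gives 1 and then 2, and constructing two Widget objects leaves Widget::live_count at 2.  Code that tests a comparison result against exactly -1 may work on one platform and fail on another, which is the kind of detail students only learn by being bitten.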

Still, these represent the two competing (possibly among many) ideas in teaching programming.  In the one camp, which is the majority, teachers say syntax is irrelevant.  The details of the syntax can be looked up, so look them up when needed.  On the other, syntax is very important, because how many people are really going to look things up (not many, in my experience), but then syntax is often the first thing students forget.

Let’s compare this to English writing.  Many an English teacher is a grammar stickler.  There is a right way to write.  There is a wrong way.  Bad grammar is seen as an offense to man and God.   However, more recently, some have advocated getting students to get their ideas across coherently, regardless of grammar.  Getting students to write is more important than the niggling details of the language.

There is something to be said for this approach.  Grammar is difficult to master.  I recall a professor, who wanted to become chair of the EE department, who complained about foreign students and their lack of good English.  He felt foreign students needed to be good at English (even as many native speakers are bad at it).  I would guess that he had no good idea how to teach good English, nor how hard it is to master.  I’d have more respect (but only barely) if he’d attempted to learn the grammar of another language (preferably Asian), and seen just how challenging it is.  Naturally, this person assumed that someone else (the student) would take care of this problem.  If he had truly cared about foreign students getting good at grammar, he might have either hired someone to help out, or simply made acceptance into the program depend on grammar.

The point is that English teachers used to stress the importance of grammar, but computer programming teachers somehow dislike this aspect of programming.  And I think, as onerous as syntax is, avoiding it insulates students from real-world programming.

The question of this essay is to ask what is programming, because once we have a semblance of answer to this question, we can then ask, how do we best teach programming?  How do newcomers to programming fail to understand the ideas of programming?  How is one beginner different from another?

In hindsight, when I was teaching programming, I stressed syntax.  Occasionally, I discussed problem solving, but honestly, not that much.   That was a mistake.  I needed to strike a balance between the two.   Indeed, even if a student never learns how to program “properly”, learning to solve problems in an algorithmic way is still of value.

Sunday, September 17, 2006

Have CS1 and CS2 Become Obsolete?

Filed under: computer science, education — compyed @ 12:26 am

CS1 and CS2 are the traditional names given to the first and second courses of programming.  Even if that’s not the name your university or college uses, it’s a shorthand name.

It was once thought that two semesters were all that’s required to master the basics of programming.  CS1 would cover the basics of the language: if statements, loops, functions, structures.  CS2 would cover basic data structures: linked lists, stacks, queues, binary search trees.

Although this was a perfectly good way to teach programming when the language was Pascal, it’s increasingly obvious that it’s not enough for any object-oriented programming language.  Indeed, I’d argue that we need four semesters of programming, rather than two.  Object-oriented programming is that tough.

This is just another indication of CS faculty not properly paying attention to what they’re teaching.  Of course, there was a pragmatic reason to keep to two semesters instead of four.  That’s two additional courses that don’t need staffing.  And believe me, staffing is a big issue.

Many an academic professor doesn’t want to teach introductory courses.  Too many students who don’t really have a head for programming sign up for it, and the teachers often lack patience when students don’t understand basic concepts like array indexing.  When professors want students to use continuations and lambda expressions, the thought that simple loops can confound a student is scary.

So imagine the horror if they had to teach not just one, not just two, but four courses.

Of course, you’re now waiting with bated breath, wondering what four courses I think are important to teach.   It’s not terribly surprising.  The first course is OO programming one.  This covers basic objects, loops, functions, etc.  Basically, object-based programming.  Not so much inheritance.  The second course is OO programming two.  This would include inheritance.  It would also discuss some basic design patterns.  Course three would be data structures, and some basic software engineering.  Course four would be design and software engineering.

When I taught at Maryland, we had three courses.  The first course was C.  The second course was C++.  The third course was C++.  Even with three courses, the drop/fail rate was pretty high.  Since then, they’ve changed the curriculum to two courses in Java, and a third course in C.  The C course was set up to teach systems-level programming (low-level programming).  My four courses admittedly miss that.

I found it interesting that Maryland would teach three semesters of introductory programming taking fifteen weeks per semester (or 45 weeks total), while the University of Washington, on a quarter system, would take two quarters (or about 20 weeks total) to cover traditional CS 1 and CS 2.  Maryland spent more than twice as long to cover intro programming.

How did UW manage to avoid awful programmers in their system?  Simple.  They have a selective major that removes more than half the eligible computer science majors.  When you’re willing to teach 200, but only want 70-80 majors, those who survive the cut are likely to be good, no matter what you did in the first courses.  It would be seriously scary if 140 students were allowed through, as they are at Maryland.

Imagine if every major did that.  Many students would simply not graduate.  Of course, computer science professors (and companies recruiting top students/programmers) love this because it means you don’t have people struggling with the basics of the language when you want to address math or some higher level concepts in operating systems.

Ultimately, where we’re failing students is in our ability to give appropriate feedback, and large universities are worse than small ones.  English departments, for example, value feedback so much that they force small class sizes when they teach.  At one point, I had to teach a course where each grader had to grade nearly 100 students.  Clearly, the graders had no time to look at the projects to see whether they were any good.  They had to check the basics, and move on.  Our resources were that limited, and the department did not see it as a travesty.  English professors and graduate students would have been up in arms.   We were not.

We need to adopt a model that’s closer to what the English department (and similar departments) uses when it comes to teaching coding.  It’s amazing that good coders develop, even in a system that doesn’t necessarily distinguish good coding from bad.

One reason I decided to be a software developer was to see coding in a real environment, and see how development was done (at least, someplace).  It’s been an eye-opener, especially during code reviews, and it’s something that computer science departments need to adopt.

Do We Need a New Pascal?

Filed under: computer science, education — compyed @ 12:09 am

I just read a short article, “Why Johnny Can’t Code”, which, in a nutshell, asks what an appropriate language for beginners is. Once upon a time, it was Pascal. Lots of people liked Pascal, and many still do. They don’t like the monstrosity of C++ or Java or C#. They’ve concluded that object-oriented programming is too tough to teach.

Now, I have a bias. Since I taught at the college level, I think of computer science education at that level. Given the title of my blog, however, I should really consider programming in high school and even earlier. How do we teach programming to kids? Many kids learn to program using a language called Logo. Logo was pretty much a cursor, which you gave commands to, and it drew lines. It was visual, which gave plenty of feedback to the programmer. It didn’t do anything particularly useful, but the ideas behind Logo (looping, conditions) are valuable to programmers.

These days, people want programming languages that let beginners build something interactive for the user, preferably of a graphical nature. This could be a GUI, but more likely it is a game, or some interactive world.

Why do teachers want a new Pascal? Even at the college level? What’s wrong with C++ or Java?

Many programming teachers learned an imperative programming language like C or Pascal. It’s taken quite a few years for teachers to learn how to think in object oriented terms.

I’ll relate a story. I was teaching with someone who was bright (at least, enough to get a Ph.D.), but clearly her background was imperative programming. How so? She developed a project where she wanted the students to break a complex task into three separate subtasks. Each of those subtasks was a public method. They had to be called in sequence. Call subtask A, then subtask B, then subtask C.

It was a good idea to break down the tasks into subtasks. It’s one of the oldest teaching ideas in programming: stepwise refinement (seems like we’ve always given fancy names to simple ideas). But it’s an awful idea in object oriented programming. In general, any class that requires you to sequence method calls is problematic. It forces an order that’s hard to check for in the program. For example, even the common programming idiom of opening and closing a file is a pain.

Ideally, you want the constructor to open a file, and the destructor to close it. But you may not want to open the file right away. Simple. Create two classes. The first class has an open() method which, when called, returns an object of the second class. That object, when it gets destructed, will close the file.
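Here is a minimal sketch of that two-class idea in C++. It is only an illustration, not code from any real project: the names FileName and OpenFile are made up for the example, and it assumes the C standard library’s fopen and fclose are an acceptable way to get at the file.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical class: owns an open file and closes it when destroyed.
class OpenFile {
public:
    OpenFile(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (handle_ == nullptr) {
            throw std::runtime_error("could not open " + path);
        }
    }

    // The destructor is the only place the file gets closed.
    ~OpenFile() {
        if (handle_ != nullptr) {
            std::fclose(handle_);
        }
    }

    // Copying would lead to closing the same file twice, so forbid it.
    OpenFile(const OpenFile&) = delete;
    OpenFile& operator=(const OpenFile&) = delete;

    // Moving hands the file over and leaves the source empty.
    OpenFile(OpenFile&& other) noexcept : handle_(other.handle_) {
        other.handle_ = nullptr;
    }
    OpenFile& operator=(OpenFile&&) = delete;

    void write(const std::string& text) {
        std::fputs(text.c_str(), handle_);
    }

private:
    std::FILE* handle_;
};

// Hypothetical class: remembers a path but opens nothing until asked.
class FileName {
public:
    explicit FileName(std::string path) : path_(std::move(path)) {}

    // Opening returns the second object; there is no close() to forget.
    OpenFile open(const char* mode = "w") const {
        return OpenFile(path_, mode);
    }

private:
    std::string path_;
};
```

A caller would write something like `FileName name("notes.txt");` and later `OpenFile f = name.open();`. When `f` goes out of scope, the file is closed, so there is no open-then-close sequence for the caller to get wrong.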

But my point is that a relatively bright person still thought imperatively, not in object-oriented terms, even though she was familiar with C++ syntax.
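To make the contrast concrete, here is a rough sketch in C++. It is not the actual project (I am inventing a tiny task, sorting and printing scores, and all the names are hypothetical). The first class exposes the subtasks as public methods that must be called in order; the second does the same stepwise refinement, but fixes the order inside the class.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins scores as "1,2,3".
std::string joinScores(const std::vector<int>& scores) {
    std::string out;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        if (i > 0) out += ",";
        out += std::to_string(scores[i]);
    }
    return out;
}

// Imperative habit: three public subtasks. Nothing stops a caller from
// skipping sort() or calling format() first, and nothing reports it.
class StepwiseReport {
public:
    void load(const std::vector<int>& raw) { scores_ = raw; }      // subtask A
    void sort() { std::sort(scores_.begin(), scores_.end()); }     // subtask B
    std::string format() const { return joinScores(scores_); }     // subtask C

private:
    std::vector<int> scores_;
};

// Same refinement, but the sequence lives in the constructor, so a
// Report that exists is always loaded and sorted.
class Report {
public:
    explicit Report(const std::vector<int>& raw) : scores_(raw) {
        std::sort(scores_.begin(), scores_.end());
    }
    std::string format() const { return joinScores(scores_); }

private:
    std::vector<int> scores_;
};
```

With StepwiseReport, forgetting subtask B compiles and runs, and simply produces unsorted output. With Report, there is no way to ask for the formatted result before the earlier steps have happened.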

It’s been argued that it’s too hard to teach with an “objects first” approach. What is objects first? Many people teach C++ by teaching C first. Learn how to program imperatively, then learn how to program using good OO principles. The reasoning behind this was obvious, yet flawed. C++ came from C, thus object-oriented programming comes from imperative programming. As big an authority as Bjarne Stroustrup argues that you can learn C++ directly.

Having said that, it’s tough to say whether Bjarne is right. After all, Bjarne’s a bright fellow, and we’re talking about teaching programming to the masses. It may even be that you can’t teach programming to the masses. It’s like teaching math to the masses. It may be that programming is at least as hard to learn as algebra, and many an American, despite taking algebra, is barely competent, years later, to do algebra.

I don’t know if teaching objects first would work or not. Many do try, and have the benefit that those who don’t get it can seek other majors. I’d like to try it myself at some point, just to see what would happen.

The reason there are objects-first advocates is Java. Java makes it very painful to write in a C-like style. While Java has a main() method, this is the primary remnant of C/C++. To write in a C-like style, you’d have to teach static methods and static variables early on.

Then, once you do that, you’d have to tell students not to use static methods and static variables, because no sane Java programmer codes in this style. Rather than teach this style and then have to convince students it is nuts, you have to teach objects early on. Even advocates of objects later still have to deal with objects really early on, as there are objects everywhere (Strings are objects, for heaven’s sake).

One reason for teaching Java or C++ in college is that it builds a skill that’s directly usable in the real world. Well, almost. Java is evolving. They just came out with Java 1.5, with generics as the primary new feature. The nice thing about Pascal? It’s fixed. No one was adding new features, because there was no need. It was a teaching language. Sure, it lacked separate compilation, and certainly wasn’t OO, but no one was going to fix the language in major ways.

Even now, many a Java educator wants to teach a controlled version of Java, that limits what students need to know, and they would like to fix this version of Java so that students, and more importantly, teachers, don’t have to keep up with the latest changes in Java.

And that’s what this blog entry is finally all about. Why are teachers reluctant to teach real languages? Because real languages evolve, and many teachers aren’t thrilled that their knowledge of a programming language is going to go obsolete if they do nothing more. Which other disciplines does this affect? Math? Nope. English? Only very slowly. In many fields, once you learn the material, that’s it. There’s no strong need to learn more, though you certainly can, and students benefit from it.

Programming? There are universities that won’t switch over to Java because the professors don’t want to learn the language. They spent time learning C++, and the languages don’t seem that different, so why bother?

A teaching language is clean; it avoids lots of details that real languages have to deal with and that you may not care to explain. When you have to learn minutiae of a language that seem to have no purpose other than to confound you, you wonder why it should be taught in the first place.

But unfortunately, as nice as that is for teaching purposes, it avoids reality, and the reality is that real languages are not always so clean. Most teachers are content to say “They have to start somewhere, and I’d rather start them on a nice clean language”. Which means to say they don’t ever want to teach a language that’s grungy. We must do it for the children. The children!

But of course, the reasons are partly to insulate the teacher, who doesn’t want to teach these details, and partly because programming is simply hard. A programming teacher, at least one who isn’t bitterly cynical, believes everyone can program. As they teach, they see some students struggle. Some struggle mightily. They begin to say “It’s not important to learn syntax! Look it up! The conditional operator is a nasty feature. Let’s not cover it!”.

Eventually, this teaches all sorts of bad habits to programmers. The lesson programming apprentices should learn is that experts learn the nooks and crannies of the language, the parts that may seem obscure. For example, continuations are a big feature of Scheme. To avoid teaching them is to avoid teaching a critical part of Scheme (well, maybe not “critical”). Great programmers learn the obscure features of a language, even when they’re not widely familiar.

Having said that, I’ve been coding in C++ again, with STL, and find that I must think about const a lot, and how to use iterators, and all sorts of mess that I don’t have to deal with in Java, because I understand Java’s classes much better, and they make a lot more sense to me. That may simply be familiarity, but the time I waste trying to think about const makes me that much less productive, despite my acknowledgement that const is an important issue which all C++ programmers need to know. (Again, my colleague found const correctness a pain to deal with and wanted to abolish it).
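For readers who haven’t fought with it, here is a small sketch of the kind of const bookkeeping I mean. The Gradebook class and the report function are made up for illustration; the point is only how const on a reference, a member function, and an iterator all have to agree.

```cpp
#include <vector>

// Hypothetical class used only to illustrate const correctness.
class Gradebook {
public:
    // Non-const member: changes the object.
    void add(int score) { scores_.push_back(score); }

    // const member: promises not to change the object. Inside it,
    // scores_ is treated as const, so only a const_iterator will do.
    int total() const {
        int sum = 0;
        for (std::vector<int>::const_iterator it = scores_.begin();
             it != scores_.end(); ++it) {
            sum += *it;
        }
        return sum;
    }

private:
    std::vector<int> scores_;
};

// Taking a const reference means only const members may be called here.
// Calling book.add(1) in this function would be a compile error.
int report(const Gradebook& book) {
    return book.total();
}
```

If total() were not marked const, report() would stop compiling, and if the loop used a plain iterator inside the const member, that would not compile either. Each piece is reasonable on its own; keeping them all consistent is where the time goes.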

There are even more radical ideas (radical to me, anyway). There are universities that want to ditch statically typed languages in favor of dynamically typed languages like Python and Ruby. These languages often have powerful string methods and regular expression support. Powerful code can be written in a few lines.

I do see some advantage in this. I think I’d write a lot more useful tiny programs if I were more well-versed in Python or Ruby. I find writing some programs in Java or C++ really painful when they would only take a few lines in these languages. But honestly, there should be some statically typed subset of these languages so that there can be some error checking at compile time. Please. (Groovy, a scripting language that is built on top of Java and compiles to Java classes, seems like something that might fit the bill).

So I can see both sides of the issue.  On the one hand, the plethora of features makes it difficult to learn a real-world language.  It’s hard on the teacher.  It’s hard on the student.  On the other hand, a clean teaching language is also a polite fiction.  If students are expected to learn real-world programming, they need to worry about all of this perceived nastiness.  Teachers want students to learn less (i.e., stick to the basics), while the real world wants them to learn more.

It may be that, in the distant future, the teachers will be right, but I doubt it.  Here’s one example: internationalization.  Java adopted Unicode to deal with international character sets.  How many teachers really want to explain, to American students versed in English, anything about internationalization?  None, I’d imagine.  It’s hard enough to teach programming (loops, functions, inheritance) without having to even mention internationalization.  Yet, it’s a topic of concern for many developers.

Software development is getting more mature, but in the process, it’s making the teaching of computer science more challenging.  I can’t say I honestly know the answer about what language to teach first, only to say that the reasons teachers push students in that direction may be at odds with the real world.
