Why do we pass by reference when we have a choice to make the variable external?

Suppose we have an array say:

int arr[1000];

and I have a function that works on that array say:

void Func(void);

Why would there ever be a need to pass by reference (by changing the void), when I can have arr[1000] as an external variable outside main()?

  1. What is the difference? Is there any difference?
  2. Why do people prefer passing by reference rather than making it external? (I myself think that making it external is easier).


If you use a global variable arr, Func is limited to always being used with that one variable and nothing else. Here are some reasons why that might be bad:

  • arr is part of the "current document" you're working with, and you later decide you want your program to support having more than one document open.
  • You later decide (or someone using your code as a library decides) to use threads, and suddenly your program randomly crashes when two threads clobber each other's work in arr.
  • You later decide to make your code a library, and now it makes sense for the caller (in case there's more than one point at which the library gets used in a program) to provide the buffer; otherwise independent parts of the calling code would have to be aware of one another's implementations.

All of these problems go away as soon as you eliminate global variables and make your functions take pointers to the data they need to operate on.
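As a minimal sketch of that change, here is what Func might look like once it takes a pointer and a length instead of reaching for a global arr. The name func_double and the doubling it performs are made up for illustration; the question never says what Func actually does.

```c
#include <stddef.h>

/* Hypothetical replacement for Func(void): the caller supplies the
   buffer and its length, so nothing here depends on a global arr. */
void func_double(int *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        data[i] *= 2;   /* modifies only the caller's buffer */
    }
}
```

A caller with two "documents" can then keep two separate arrays, say doc_a and doc_b, and call func_double(doc_a, 3) and func_double(doc_b, 2) independently. Two threads can do the same with their own buffers without touching each other's data.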


I think you're asking if global variables are bad. Quoting an excellent answer:

The problem with global variables is that since every function has access to these, it becomes increasingly hard to figure out which functions actually read and write these variables.

To understand how the application works, you pretty much have to take into account every function which modifies the global state. That can be done, but as the application grows it will get harder to the point of being virtually impossible (or at least a complete waste of time).

If you don't rely on global variables, you can pass state around between different functions as needed. That way you stand a much better chance of understanding what each function does, as you don't need to take the global state into account.


If arr is external then anyone can modify it, not just Func. This is Officially Bad.

Passing arguments ensures that you know what data you are changing and who is changing it.

EDIT: Where Officially Bad means "Usually bad, but not always. Generally don't do it unless you have a good reason." Just like all the other "rules" of software development :)
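One way to make "who is changing it" visible in the signature itself is const. The sketch below uses two hypothetical helpers (sum_values and fill_values are not from the question): one only reads the array, the other writes to it, and the parameter types say which is which.

```c
#include <stddef.h>

/* Read-only access: const tells the caller that sum_values will not
   modify the array through this pointer. */
long sum_values(const int *data, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += data[i];
    }
    return total;
}

/* Write access: the non-const pointer signals that the array is modified. */
void fill_values(int *data, size_t len, int value)
{
    for (size_t i = 0; i < len; i++) {
        data[i] = value;
    }
}
```

With a global array neither function would take a parameter, so a reader would have to open both bodies to find out which one writes.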


By making the variable external to the function, the function is now tightly coupled to the module that defines the variable, and is thus harder to reuse in other programs. It also means that your function can only ever work on that one array, which limits the function's flexibility. Suppose one day your requirements change, and now you have to process multiple arrays with Func.

By passing the array as a parameter (along with the array size), the function becomes more easily decoupled from the module using it (meaning it can be more easily used by other programs/modules), and you can now use the function to process more than one array.

From a general code maintenance standpoint, it's best that functions and their callers communicate through parameters and return values rather than rely on shared variables.


It's largely a matter of scope. If you make all your variables external/global in scope, how confusing is that going to get?

Not only that, but you'll have a large number of variables that simply do not need to exist at any given time. Passing function arguments around instead of having lots of global variables lets you more easily get rid of things you no longer need.


Passing by reference (rather than using a global variable) makes it more clear to someone reading the code that the function may change the values of the array.

Additionally, if you wanted to perform the action on more than one array, you could use the same function over and over and pass a different array to it each time.

Another reason is that when writing multi-threaded code you usually want each thread to exclusively own as much of the data it works on as possible (sharing writable data is expensive and may result in race conditions if not done properly). By restricting global variable access, using local variables, and passing references, you can more easily write code that is thread (and signal handler) friendly.

As an example, let's look at the simple puts function.

int puts(const char *s);

This function writes a C string to standard output, which can be useful. You might write some complicated code that uses puts to output messages about what it is doing at different stages of execution.

 int my_complicated_code(int x, int y, int z);

Now, imagine that you call the function several times in the program, but one of those times you don't want it to write to standard output, but to some other FILE *. If all of your calls to puts were actually calls to fputs, which takes a FILE * that says which file to print to, this would be easy to accomplish by changing my_complicated_code to take a FILE * as well as its other arguments.

 int my_complicated_code(int x, int y, int z, FILE * out_file);

Now you can decide which file it will print to at the time when you call my_complicated_code by passing it a reference to any FILE * you have (that is open for writing).
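A minimal sketch of what that might look like follows. The body is invented for illustration (the answer only gives the prototype), so the messages and the returned sum are assumptions.

```c
#include <stdio.h>

/* Hypothetical body for my_complicated_code: the real one is not shown
   in the answer, so this just reports progress and returns a sum. */
int my_complicated_code(int x, int y, int z, FILE *out_file)
{
    fputs("starting\n", out_file);               /* was: puts("starting") */
    int result = x + y + z;
    fprintf(out_file, "result = %d\n", result);
    return result;
}
```

Most call sites would pass stdout, which keeps the old behaviour, while the one special call passes a FILE * obtained from fopen or tmpfile.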

The same thing follows for arrays. The memcpy function would be much less useful if it only copied data to one particular location, or only from one particular location. Instead it takes two pointers, one to the destination and one to the source.

It is often easier to write unit tests for functions that take references too since they don't make assumptions about where the data they need is or what its name is. You don't have to keep updating an array with a certain name to mimic the input you want to test, just create a different array for each test and pass it to your function.
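To illustrate the testing point, here is a small sketch with a made-up function under test, max_value, and a test routine that builds a fresh array for each case. Both names are hypothetical, and returning INT_MIN for an empty array is just a convention chosen for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical function under test: largest element, or INT_MIN if len is 0. */
int max_value(const int *data, size_t len)
{
    int best = INT_MIN;
    for (size_t i = 0; i < len; i++) {
        if (data[i] > best) {
            best = data[i];
        }
    }
    return best;
}

/* Each case builds its own small input; there is no shared array to reset. */
void test_max_value(void)
{
    const int ascending[] = {1, 2, 3};
    const int negatives[] = {-5, -2, -9};
    const int single[]    = {42};

    assert(max_value(ascending, 3) == 3);
    assert(max_value(negatives, 3) == -2);
    assert(max_value(single, 1) == 42);
    assert(max_value(single, 0) == INT_MIN);   /* empty input */
}
```

Had max_value read from a global arr[1000], every case would need to overwrite that one array first, and the cases could not run independently.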

In many simple programs it may seem like it is easier to write code using global variables like this, but as programs get bigger this is not the case.


As an addition to all the other answers already giving good reasons: Every single decision in programming is a tradeoff between different advantages and disadvantages. Decades of programming experience by generations of programmers have shown that global state is a bad thing in most cases. There is even a programming paradigm built around the avoidance of it, taking it to the extreme of avoiding state at all:

http://en.wikipedia.org/wiki/Functional_programming

You may find it easier at the moment, but as your projects keep growing, at some point you will have implemented so many workarounds for the problems that came up along the way that you will find yourself unable to maintain your own code.


  1. There is a difference in scope. If you declare "int arr[1000]" in your main() for instance, you cannot access it in your function "another_function()". You would have to explicitly pass it by reference to every other function in which you want to use it. If it were external, it would be accessible in every function.

  2. See (1.)


It's a maintenance issue too. Why would I want to have to track down some external somewhere when I can just look at the function and see what it is supposed to be?
