I'm trying to fully understand what Martin explains in his discussion of encapsulation. He uses the C and C++ languages in his discussion.

Because I do not have a background in C/C++, I'm finding it difficult to understand what he is explaining. I assume that Delphi also breaks perfect encapsulation according to Martin's explanation.

I would very much appreciate some help in trying to understand what Martin is talking about, especially within the context of how Delphi breaks perfect encapsulation.

Clean Architecture pages 34-37 (C) Robert Martin

Encapsulation?

The reason encapsulation is cited as part of the definition of OO is that OO languages provide easy and effective encapsulation of data and function. As a result, a line can be drawn around a cohesive set of data and functions. Outside of that line, the data is hidden and only some of the functions are known. We see this concept in action as the private data members and the public member functions of a class.

This idea is certainly not unique to OO. Indeed, we had perfect encapsulation in C. Consider this simple C program:

(See image 1)

The users of point.h have no access whatsoever to the members of struct Point. They can call the makePoint() function and the distance() function, but they have absolutely no knowledge of the implementation of either the Point data structure or the functions.

This is perfect encapsulation in a non OO language. C programmers used to do this kind of thing all the time. We would forward declare data structures and functions in header files, and then implement them in implementation files. Our users never had access to the elements in those implementation files.
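[Image 1 is not reproduced in this post. Here is a minimal sketch of the pattern the text describes, not the book's actual listing. It is collapsed into one file and written as C-style C++ (in plain C the cast on malloc would be dropped). makePoint and distance are named in the text; freePoint and userCode are hypothetical additions for the illustration.]

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// ---- point.h: all that users get to see ----
struct Point;                                   // incomplete type, no members visible
struct Point* makePoint(double x, double y);
double distance(struct Point* p1, struct Point* p2);
void freePoint(struct Point* p);                // hypothetical helper, not named in the text

// ---- user code: compiled against the declarations above only ----
double userCode() {
    struct Point* origin = makePoint(0.0, 0.0);
    struct Point* other = makePoint(3.0, 4.0);
    // origin->x = 1.0;      // would not compile: Point is incomplete at this point
    // sizeof(struct Point)  // would not compile either: the size is unknown here
    double d = distance(origin, other);
    freePoint(origin);
    freePoint(other);
    return d;
}

// ---- point.c: the layout and the function bodies, hidden from users ----
struct Point {
    double x, y;
};

struct Point* makePoint(double x, double y) {
    struct Point* p = static_cast<struct Point*>(std::malloc(sizeof(struct Point)));
    assert(p != nullptr);                       // sketch only: no real error handling
    p->x = x;
    p->y = y;
    return p;
}

double distance(struct Point* p1, struct Point* p2) {
    double dx = p1->x - p2->x;
    double dy = p1->y - p2->y;
    return std::sqrt(dx * dx + dy * dy);
}

void freePoint(struct Point* p) {
    std::free(p);
}
```

The user section only ever holds pointers to Point. It cannot name a member or ask for the size, because the compiler has seen nothing but the forward declaration at that point.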

But then came OO in the form of C++ and the perfect encapsulation of C was broken.

The C++ compiler, for technical reasons (the C++ compiler needs to know the size of the instances of each class) needed the member variables of a class to be declared in the header file of that class. So our Point program changed to look like this:

(See image 2)

Clients of the header file point.h know about the member variables x and y! The compiler will prevent access to them, but the client still knows they exist. For example, if those member names are changed, the point.cc file must be recompiled! Encapsulation has been broken.
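[Image 2 is not reproduced in this post. Here is a minimal sketch of what a C++ header of this kind typically looks like, assuming a Point class with two private doubles. It is not the book's actual listing, and clientCode is a hypothetical name.]

```cpp
#include <cmath>

// ---- point.h: the class definition, private members included, sits in the header ----
class Point {
public:
    Point(double x, double y);
    double distance(const Point& p) const;

private:
    double x;   // visible to every client that includes the header,
    double y;   // even though the compiler rejects any access to them
};

// ---- point.cc: the member function bodies ----
Point::Point(double x, double y) : x(x), y(y) {}

double Point::distance(const Point& p) const {
    double dx = x - p.x;
    double dy = y - p.y;
    return std::sqrt(dx * dx + dy * dy);
}

// ---- client code ----
double clientCode() {
    Point a(0.0, 0.0);      // created by value: the compiler needs sizeof(Point) right here
    Point b(3.0, 4.0);
    // a.x = 1.0;           // rejected by the compiler: x is private
    return a.distance(b);
}
```

Because clients create Point by value, the compiler must know its full layout when it compiles them. That is the technical reason the text gives for the members appearing in the header.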

Indeed, the way encapsulation is partially repaired is by introducing the public, private, and protected keywords into the language. This, however, was a hack necessitated by the technical need for the compiler to see those variables in the header file.

Java and C# simply abolish the header/implementation split altogether, thereby weakening encapsulation even more. In these languages, it is impossible to separate the declaration and definition of a class.

For these reasons, it is difficult to accept that OO depends on strong encapsulation. Indeed, many OO languages (for example, Smalltalk, Python, JavaScript, Lua, and Ruby) have little or no enforced encapsulation.

OO certainly does depend on the idea that programmers are well behaved enough to not circumvent encapsulated data. Even so, the languages that claim to provide OO have only weakened the perfect encapsulation we enjoyed with C.



Comments

  1. You will need to learn enough C and C++ to understand the examples presented. Once you have that knowledge, the examples could not be any clearer.

  2. AFAIK this is the problem

    {code}
    unit encapsulation;

    interface

    type
      TPoint = class
        constructor Create(x, y: Double);
        function Distance(p: TPoint): Double;
      end;

    implementation

    // how can I store x and y without declaring them in the interface part of the unit ?

    end.
    {code}

    but in Pascal there's not a .h and a .c, only one .pas :)

    anyway, with some nasty code it's possible :)

    {code}
    unit Points;

    interface

    type
      TPoint = class
        class function Create(x, y: Double): TPoint;
        function Distance(p: TPoint): Double;
      end;

    implementation

    type
      THiddenPoint = class(TPoint)
        x, y: Double;
        constructor Create;
      end;

      TPointHelper = class helper for TPoint
        function Hidden: THiddenPoint; inline;
      end;

    constructor THiddenPoint.Create;
    begin
      TObject.Create;
    end;

    function TPointHelper.Hidden: THiddenPoint;
    begin
      Result := THiddenPoint(Self);
    end;

    { TPoint }

    class function TPoint.Create(x, y: Double): TPoint;
    begin
      Result := THiddenPoint.Create;
      Result.Hidden.x := x;
      Result.Hidden.y := y;
    end;

    function TPoint.Distance(p: TPoint): Double;
    var
      dx, dy: Double;
    begin
      dx := Hidden.x - p.Hidden.x;
      dy := Hidden.y - p.Hidden.y;
      Result := sqrt(dx * dx + dy * dy);
    end;

    procedure test;
    var
      p1, p2: TPoint;
    begin
      p1 := TPoint.Create(0, 0);
      p2 := TPoint.Create(100, 0);
      assert(p1.Distance(p2) = 100);
    end;

    initialization
    test();
    end.
    {code}

  3. As you know in C++ you can have headers (.h) and implementations (.cpp) and when you make a program you need both. When you are going to spread your code around you can give the header (.h) and the object file (.o) instead of the cpp.

    This is information hiding; in that way a person (with g++ for example) can link together the .h and the .o but he can only see the code in the header. Since you gave him the .o and NOT the .cpp he cannot see/steal your implementation!

    The technique is used when you want to hide almost everything because, as you can see, in the first version the header file has the declaration and the definition of the class, but the second version has just a declaration and a pointer.

  4. Alberto Miola No, that is absolutely not what is being said.

  5. >For these reasons, it is difficult to accept that OO depends on
    >strong encapsulation. Indeed, many OO languages (for
    >example, Smalltalk, Python, JavaScript, Lua, and Ruby) have
    >little or no enforced encapsulation.

    That's the real takeaway, which makes me wonder why he's whining about encapsulation in the first place.



  6. His point seems to be that when exposing "plain" classes in C++ (and likewise in Delphi), you cannot hide your implementation details from the user.

    You can hide the details from the compiler by using private/protected/public visibility, but you're still "leaking" internal details.

    Of course, there are ways around this, usually by using pure virtual classes in C++ and interfaces in Delphi. By using an interface you can hide the implementation completely.
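    A minimal C++ sketch of the pure-virtual-class approach, under the assumption that the public header contains only an abstract class and a factory function. IPoint, CartesianPoint and makePoint are hypothetical names chosen for the illustration; a Delphi version would use an interface type plus a class declared in the implementation section.

    ```cpp
    #include <cmath>
    #include <memory>

    // ---- public header: abstract interface plus a factory function ----
    class IPoint {
    public:
        virtual ~IPoint() = default;
        virtual double x() const = 0;
        virtual double y() const = 0;
        virtual double distance(const IPoint& other) const = 0;
    };

    std::unique_ptr<IPoint> makePoint(double x, double y);  // hypothetical factory

    // ---- implementation file: the concrete class never appears in a header ----
    namespace {

    class CartesianPoint : public IPoint {
    public:
        CartesianPoint(double x, double y) : x_(x), y_(y) {}
        double x() const override { return x_; }
        double y() const override { return y_; }
        double distance(const IPoint& other) const override {
            double dx = x_ - other.x();
            double dy = y_ - other.y();
            return std::sqrt(dx * dx + dy * dy);
        }

    private:
        double x_, y_;   // storage can change without touching the public header
    };

    }  // namespace

    std::unique_ptr<IPoint> makePoint(double x, double y) {
        return std::unique_ptr<IPoint>(new CartesianPoint(x, y));
    }
    ```

    Callers only ever see IPoint, so the fields of the concrete class are invisible to them. The price is a heap allocation and virtual calls.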

    That being said, I've found that more often than not, having access to internals is a good thing, as it allows for workarounds and extensions. For that reason I prefer to not hide the implementation as such. Rather I prefer to mark it as an implementation detail, with the understanding that the implementation details may change in future updates.

  7. Interesting, never having been a C programmer, I didn't realise it supported pure encapsulation like that. That said, interface types surely give you the same thing, in fact even bog-standard polymorphism-by-inheritance does (or can do). Classic example in Delphi is probably TStrings, which provides most of the 'string list' functionality itself but leaves actual storage to a subclass - and the normal VCL pattern is to then bury the subclass (TMemoStrings etc.) in the implementation section of its unit.

    By the by, is the author claiming perfect encapsulation to be a good thing in general? I'd be interested in his reasoning if he does. Also, I'm not sure I'd call JavaScript an 'OO' language, though I realise some people do without flinching...

  8. Asbjørn Heid Ah, you beat me to it! Good example of pure encapsulation being a right 'mare in Delphi-land is FMX's interfacing with the native API - it's mildly better now, but still key details are buried in implementation sections (the VCL TStrings principle run amok) which makes it impossible to work around the numerous bugs and lack of built-in extensibility without forking the source.

  9. When talking about any programming concept, there are a few things to keep in mind.

    First, language implementation details have a great impact on how a given concept is represented in a particular language. Second, some concepts evolved as solutions to particular problems that arise from specific language features (or the lack of them).

    And most important, the whole purpose of concepts and theory is to serve practice. So the first question to ask yourself is not what a concept means, how it can be implemented, and how well. The primary question is how that concept helps in practice. What is its purpose?

    So, what is the purpose of encapsulation? It is not about pure data hiding; it is not about building Fort Knox around your data and private class members. It is about maintaining a stable public interface (API), so that implementation details of a class or any other entity that are subject to change don't leak out and force us to change zillions of lines of consumer code. It is also about providing a clear separation between private and public parts, preventing accidental use of inner parts. Such separation makes a particular entity (class, record or whatever) easier to use.

    For instance, if you are dealing with an XML parser, you are only interested in a single public function, load, while the class can contain hundreds of private functions used to achieve that functionality. You don't want to burden consumers of such a class with all that.

    Also, there are several levels of protection. Unless you are making a modular design, where you can plug in different implementations behind the same interface, you don't have to dwell on whether you will have to recompile only some parts or all of the code when you change the implementation. In that light, whether or not you have the double type exposed in the interface (or header) file is really not an issue.

    Also, no matter how much you want to encapsulate something, some details will inevitably leak. For instance, in the Point example, hiding the fact that x and y are doubles is an exercise in futility. The second you have a constructor that takes two doubles, that information is leaking. All attempts to hide it are pointless.

    Some basic primitive types will always leak, and you cannot escape that fact. That Point class (struct) example is a rather bad one. It is in the nature of Point that it cannot be absolutely encapsulated.

    Some classes can benefit from strict encapsulation, some don't. If you have a class where knowledge of the inner data is not crucial for the class's functionality, then it is better to hide the details that can be hidden.

    So, while hiding x and y in Point is meaningless, when implementing a Point list you will want to hide that you are using an array of points to store them. Exposing that array directly breaks encapsulation, because it is an implementation detail that you may want to change. What you want to do is provide a way to access specific points through getter and setter methods, as well as an enumerator to iterate through the list. Then if you change the array to a TList, consumers of that class will not care and will not need to change.
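    The Point list idea can be sketched in C++ as follows (the comment itself is about Delphi, so this is an analogue under assumptions, not a translation of any existing class). PointList, get, set and sumX are hypothetical names, and std::vector stands in for the array.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <vector>

    struct Point {
        double x, y;
    };

    class PointList {
    public:
        void add(Point p) { points_.push_back(p); }
        std::size_t count() const { return points_.size(); }

        // Getter and setter: consumers never touch the storage directly.
        Point get(std::size_t index) const {
            assert(index < points_.size());
            return points_[index];
        }
        void set(std::size_t index, Point p) {
            assert(index < points_.size());
            points_[index] = p;
        }

        // Enumerator support: enough for a range-based for loop.
        using const_iterator = std::vector<Point>::const_iterator;
        const_iterator begin() const { return points_.begin(); }
        const_iterator end() const { return points_.end(); }

    private:
        std::vector<Point> points_;   // the hidden implementation detail
    };

    // Consumer code: written purely against the public interface.
    double sumX(const PointList& list) {
        double total = 0.0;
        for (const Point& p : list) {
            total += p.x;
        }
        return total;
    }
    ```

    If points_ later becomes a std::deque<Point>, only the private member and the const_iterator alias change; sumX and other consumer code compile unchanged.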

    When you think about encapsulation, you have to think about how inner change affects consumers, but you generally don't have to go beyond that.

    TPoint = class
    public
      x, y: double;
    end;

    The above class in Delphi is properly encapsulated, even though it is wide open. How?

    Because, if the need arises, you can add an additional level of protection through properties without breaking consumers.

    TPoint = class
    private
      Fx, Fy: double;
    public
      property x: double read Fx write Fx;
      property y: double read Fy write Fy;
    end;

  10. Now you have basically the same thing written in a few more lines, but consumers will not suffer from that change. If you want to take it to the next level, you can add a setter method that performs some specific checking.

    The only thing that can break the above class is deciding that you will not allow setting x and y values after initial construction. But in that case any of the other, better encapsulated examples would be equally broken, because you have changed the public API.

    In that light, absolute encapsulation is not only impossible to achieve, but it is also not of absolute importance.

  11. Dalija Prasnikar, thank you for your very thoughtful and extremely well worded answer. I appreciate you taking the time to point out the "how to think" about what encapsulation means.

  12. Dalija Prasnikar That's not what Bob was talking about though, he'd regard your Delphi class as not encapsulated.

  13. David Heffernan I know. The question is does it really matter?

    It only matters if you are theorizing, and has very little relevance for the actual coding practice.

  14. Dalija Prasnikar "It only matters if you are theorizing, and has very little relevance for the actual coding practice."

    Now there's the second great takeaway from this discussion! :-)

  15. Dalija Prasnikar "In that light, absolute encapsulation is not only impossible to achieve, but it is also not of absolute importance."

    Don't let Rudy read that! :-) I seem to remember him saying something to the effect of just because you let someone use your code it doesn't mean they can do whatever they want with it. :-)

  16. Dalija Prasnikar It matters if you want to answer the question that Michael asked. I suppose it doesn't matter if you want to answer the question you would have preferred that he asked.

    I don't think Bob is making the point that the less than perfect encapsulation of major OO languages is a critical flaw. Read the final two paragraphs of the excerpt. He just says that OO doesn't depend on strong encapsulation.

  17. David Heffernan Well, this is not Stack Overflow, so I don't have to answer the question asked ;)

    What is confusing here (and there is a possibility that it is explained in more detail elsewhere in the book) is that Bob circles around perfect or imperfect encapsulation, how it is broken or not, and whether it is actually necessary, without covering what the purpose of encapsulation actually is.

    You can know the mechanics of how to achieve it, but if you don't know why you need it then you only know half of what you need to know.

    My point is that while dissecting the tree, you may completely miss the forest.

  18. Dalija Prasnikar The text quoted by Michael is part of a large whole. The argument that Bob makes is perfectly reasonable when read as part of that larger whole. Which is why I think your criticisms are out of place. You are taking this text out of context.

    The larger whole is that Bob is arguing that encapsulation is not the be all and end all of OO languages.

    sm-dev.edutone.net - sm-dev.edutone.net/Architect/Clean%20Architecture.pdf

    Michael's problem is that he can't understand the C and C++ code and so can't understand the illustrations used to support the argument. I don't think Michael is really going to benefit from you presenting a different argument when he's trying to get to grips with what Bob is talking about.

    What Michael needs to do is learn enough C to understand incomplete type declarations and how they are used with the PIMPL pattern. Then this text will make complete sense, and the broader argument can be understood.
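    For reference, a minimal sketch of the PIMPL pattern in C++, built on an incomplete type declaration. This is an illustration under assumptions, not code from the book; Impl and impl_ are conventional but hypothetical names, and the header and implementation file are shown together in one listing.

    ```cpp
    #include <cmath>
    #include <memory>

    // ---- point.h: clients see a pointer to an incomplete type, nothing more ----
    class Point {
    public:
        Point(double x, double y);
        ~Point();                        // defined below, where Impl is complete
        double distance(const Point& other) const;

    private:
        struct Impl;                     // incomplete type declaration
        std::unique_ptr<Impl> impl_;     // the only data member in the header
    };

    // ---- point.cc: the real data members live here ----
    struct Point::Impl {
        double x, y;
    };

    Point::Point(double x, double y) : impl_(new Impl{x, y}) {}

    Point::~Point() = default;

    double Point::distance(const Point& other) const {
        double dx = impl_->x - other.impl_->x;
        double dy = impl_->y - other.impl_->y;
        return std::sqrt(dx * dx + dy * dy);
    }
    ```

    The header no longer mentions x or y, so changing the fields only requires recompiling the implementation file. Every Point now costs a heap allocation, and the class is no longer copyable unless copy operations are written by hand.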

  19. Paul TOTH, that concept seems similar to MVVM. Model, View, View Model.

  20. David Heffernan As I said, there is a possibility that not knowing the whole book dilutes the author's message.

    But regardless of the whole, and regardless of whatever he wants to point out, he is doing a bad job with a confusing example.

    Yes, maybe Michael Riley has additional issues in understanding the example because he is not familiar with the language, but I do understand the example and I still think it is confusing and doesn't illustrate the point well. Or should I say I am not sure what actually is the point he is trying to make.

    That encapsulation is not the alpha and omega of OOP? It is not, but saying that encapsulation is completely irrelevant in OOP, as well as in other paradigms, would also paint the wrong picture. Because, like I said earlier, it is not all about "how" but also "why".

    I could make another point proving that perfect encapsulation can be bad. Imagine his perfectly encapsulated C example where the struct is implemented as

    struct Point {
        int x, y;
    };

    Surely, some implementation details do matter ;)

    I probably agree with Bob way more than it seems at first sight, but I also disagree with him, too. And more often than not it is easy to misinterpret written words.

  21. Dalija Prasnikar I think you are missing the point

  22. David Heffernan Maybe... maybe not. Maybe the whole point is that the point is not clearly presented.

  23. Dalija Prasnikar It's perfectly clear to me. It's only when you start taking this excerpt out of context that you go off the rails.

  24. David Heffernan Hmm, while I get what it's saying, the original text is still playing on words to me. A necklace locked inside a glass capsule is no less 'encapsulated' because you can still see it through the glass... ergo fields being 'encapsulated' in an object via a 'private' modifier in the class declaration does not imply a 'broken' form of encapsulation.

  25. Chris Rolliston He is not saying that encapsulation in OO languages is broken. He is saying that encapsulation is not the be all and end all of OO languages.

  26. David Heffernan And precisely that point is lost through an imperfect example.

  27. David Heffernan that exact phrase is quoted in Michael's original post, the argument being encapsulation can't be the be all and end all in OOP because actually existing OOP languages support only an imperfect form of it in the first place.

  28. Dalija Prasnikar Perhaps you should read the entire chapter. None of this is helping Michael very much though.

  29. David Heffernan An argument that conflates visibility with access, and purports to rebut the idea that encapsulation is the essence of OOP when it is really rebutting the idea that encapsulation is OOP's gift to the world (a far more dismissible claim)... and to boot, an argument that also doesn't make any sense in a Pascal/Delphi context. I'm not surprised some people are getting confused TBH...

  30. Michael Riley I think all you really need to get from this is that Delphi has the same situation -- you typically have to list the data members of a class in the interface part, which although it's not separated from the implementation part as it is in c/c++, is still needed by the compiler to calculate the size of the class.

    It's not eliminated simply by declaring an interface, because that just pushes the declaration down a level -- you cannot create an instance of an interface without having a concrete class defined somewhere.

    If you want to get really crazy about it, you'd create your classes in two "layers", where one would contain just the public elements, and a sub-layer that contains private elements.

    But I agree with David Heffernan in that you have to interpret this within the context it's presented in the chapter. It's not about "encapsulation" or whether it's "perfect" or not -- it's about "what constitutes object-oriented programming". Bob is basically shooting at sacred cows.

    I've seen some quite interesting open-source projects that were written in c99 that are structured to look and work similar to c++. The code reads like what you'd see if you were reading c++ code exposed through an x-ray machine, so to speak. Anybody will tell you that "c is NOT an OOP!" But reading this code, you'd be hard-pressed to agree with that assertion. It's written using a very disciplined coding style that makes it look and feel every bit as if it's an OOP. The fact is, the compiler is doing nothing to enforce what you get by a typical c++ compiler, but it's all there nonetheless.

    This is partially what Bob is getting at in this chapter. Y'all will go down a rat-hole arguing about whether such code is "OO" or not, simply because it's not using what might be considered an OO programming language. If you follow the discipline that the code embodies, then it's hard to say it's NOT OO, even though it's not using an OOP.

  31. Michael Riley have a look at this talk by Uncle Bob https://youtu.be/TMuno5RZNeE
