== Problem: Internal Coupling, Interfaces, and User Experience == #???

  The last few weeks of broader use of the rainbow codebase have revealed three
  problems that we need to fix before wider deployment.

Situation:

  First, olpc-update is too strongly coupled to Rainbow. This is a problem
  because it makes it hard to modify either piece of software without
  unintentionally breaking the other.

  Next, Rainbow does not have a workable user interface. Users of the software
  have universally been unable to follow or to capture rainbow traces and are
  frequently frustrated because rainbow fails to offer human-readable
  explanations of why it is failing.  Two important facets of this issue are
  that it violates the Principle of Least Surprise and the Principle of
  Silence, and that you can only interpret the traces if you know exactly what
  the trace is *supposed* to be. Also, the traces are not packaged in a form
  convenient for uploading to me or, better, for automatic aggregation.

  Finally, Rainbow's internal interfaces are too strongly coupled. This hinders
  development because, without a test suite, console entry points, and
  cut-points (in the AOP sense), it's very hard to modify the software without
  understanding the whole code base. In particular, you can't really produce a
  known-good initial configuration and then iterate changes against that
  configuration.

Thoughts:

  I intend to separate olpc-update out from rainbow by splitting rainbow up
  into a pipeline of scripts, probably organized along the current 'stage'
  boundaries, that more exhaustively verify their starting and ending
  assumptions.

  This decomposition will accomplish several things:

  1) it will allow me to properly split olpc-update into a separate package
  and should reduce the code breakage that results when changes to one
  component violate the assumptions made by a different component.

  2) it will give me a vehicle for automatically collecting trace data from
  third-party users - I will simply ask them to exercise the shell pipeline
  through a wrapper that records full traces, packages them nicely, and
  attempts to send them to me (see the sketch after this list).

  3) it will clarify the locations in the design in which outside data is
  processed and it will document the protocol by which these interactions
  occur, e.g. by forcing these data to be passed through introspectable media
  like files, environment variables, and command-line options

  4) it will clarify the life-cycle of contained applications

  5) it will provide a better opportunity to exhaustively verify the
  assumptions of each stage before continuing

  6) it will make it easy and robust to start using a prototype process simply
  by adding a wrapper to the chain
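
  To make the shape of such a stage concrete, here is a minimal sketch in
  Python. Everything in it is invented for illustration - the RAINBOW_STATE
  directory, the binding names, and the stage itself are not a real interface
  - but it shows the pattern I have in mind: read bindings from plain files,
  verify the starting assumptions loudly, do the work, then record the ending
  assumptions for the next stage to check.

    #!/usr/bin/env python
    # check-rootfs: a sketch of one stage in the proposed pipeline.
    # The state directory, binding names, and stage name are all made up.
    import os
    import sys

    STATE = os.environ.get('RAINBOW_STATE', '/var/lib/rainbow/state')

    def read_binding(name):
        # Bindings are ordinary files, so find and grep can inspect them.
        with open(os.path.join(STATE, name)) as f:
            return f.read().strip()

    def write_binding(name, value):
        with open(os.path.join(STATE, name), 'w') as f:
            f.write(value + '\n')

    def main():
        # Starting assumptions: fail loudly, with a human-readable message.
        rootfs = read_binding('rootfs_path')
        if not os.path.isdir(rootfs):
            sys.exit('check-rootfs: %r is not a directory; did the previous '
                     'stage run?' % rootfs)

        # ... the stage's real work would go here ...

        # Ending assumptions: record the result for the next stage to check.
        write_binding('rootfs_checked', 'yes')

    if __name__ == '__main__':
        main()

  The trace-collecting wrapper mentioned above could then be little more than
  a loop that runs the stages in order, tees their output into the same state
  directory, and tars the directory up for uploading.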


Downsides:

  * It adds a lot of option-parsing complexity
      - The interfaces need to be documented anyway, and being able to pick up
        part-way through an interaction / single-step could be a big plus
  * It probably uses more resources than the current architecture
      - The current architecture is just not adequately transparent, so its
        efficiency can be seen as a premature optimization.
      - Also, if it means that we get prototypes, we probably win.
  * it's not clear that it will actually be easier to give good user feedback
      - it will be easier to debug, though, because you can start part-way
        through and because you don't have to work *through* sugar to try
        things.
      - one could, for example, experiment with new pieces of isolation without
        dealing with Sugar and the Terminal.
  * do we have any data that we have to communicate backwards? If so, how are
    we going to accomplish this?
      - perhaps we could take the python-web-server idea of having a response
        object that you piece together and ultimately return
      - The chaining model is basically a linear graph of continuations. We
        could expand on this by putting together a more interesting graph,
        e.g. by adding failure continuations (sketched after this list).
  * why not go with a web-server?
      - It still feels too monolithic and inadequately transparent to me. I
        really want to be able to examine the intermediate state of the
        system with find and grep.
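
  For what it's worth, here is a rough sketch of the failure-continuation
  idea, with invented stage names; each stage returns the next thing to run,
  so the driver stays trivial while the chain generalizes from a line to a
  graph:

    # Sketch of the continuation idea: each stage returns the next stage to
    # run (a success or failure continuation).  All names are made up.

    def report_failure(state):
        print('failed: %s' % state.get('error'))
        return None                       # end of the computation

    def verify_stage(state):
        if not state.get('contents_ok'):
            state['error'] = 'contents did not verify'
            return report_failure         # failure continuation
        return install_stage              # success continuation

    def install_stage(state):
        print('installing build %s' % state['build'])
        return None

    def run(stage, state):
        while stage is not None:
            stage = stage(state)

    run(verify_stage, {'build': 767, 'contents_ok': True})

  The response-object idea could be layered on top of this by having the
  stages fill in such a dictionary and return it at the end rather than
  printing.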

Model:

  The basic model being proposed here is to use the filesystem as an abstract
  store, to regard individual shell scripts as being fragments of an overall
  program (written in CPS) with *well-defined interfaces*, and to thereby
  expose the overall computation being performed to analysis by normal unix
  filesystem tools.
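
  For concreteness, here is one possible shape for such a store, sketched in
  Python with invented names. It happens to answer some of the questions below
  in one particular way - bindings are never overwritten, the sequence is
  recorded in the filenames, and every binding says who made it and why - but
  the main point is just that the whole store stays visible to find and grep:

    import os
    import time

    class Store(object):
        # One candidate store: each binding is an append-only series of
        # plain-text files, so old values remain available for inspection.
        def __init__(self, root):
            self.root = root

        def bind(self, name, value, who, why):
            d = os.path.join(self.root, name)
            if not os.path.isdir(d):
                os.makedirs(d)
            seq = len(os.listdir(d))       # sequence number of this binding
            path = os.path.join(d, '%04d' % seq)
            with open(path, 'w') as f:
                # Plain text, so `grep -r` over the store answers who and why.
                f.write('value: %s\nwho: %s\nwhy: %s\nwhen: %s\n'
                        % (value, who, why, time.ctime()))

        def lookup(self, name):
            # Return the most recent value bound to `name`.
            d = os.path.join(self.root, name)
            latest = sorted(os.listdir(d))[-1]
            with open(os.path.join(d, latest)) as f:
                for line in f:
                    if line.startswith('value: '):
                        return line[len('value: '):].strip()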

Questions for the Model:
  * Are the individual chunks supposed to execute atomically with
    respect to one another?

Questions for the Store:
  * Is the store a naïve Scheme store or is it like the C heap?
  * Does the store preserve old bindings for inspection?
  * Perhaps it records the sequence in which bindings are added?
  * Can it record who made a binding, and why?
  * Thread-local or global?
  * Configurable location?
  * File-system or environment-variable + tempfile?
  * Typed or untyped? (Pickles, JSON, repr)?