Saturday, January 30, 2016

Refactoring: Change unidirectional association to bidirectional

Many times, when coding, we use different data structures: stacks, queues, trees.

These structures have the elements linked between them, allowing parsing and searching.

Usually when we start coding such a structure we consider a unidirectional link between the elements, mostly to save memory.

Let's consider the following class:


class Person {
    var name: String?
    var kids = [Person] ()

    func description() -> String {
        return "Person: name = \(name)"
    }   
}

Now, let's initialize few objects.

The name of the objects is totally against the rules we discussed in the naming post, but we would not call ourselves real deal programmers if we did not break some rules now and then.

let john = Person()
john.name = "John"

let jimmy = Person()
jimmy.name = "Jimmy"

let mary = Person()
mary.name = "Mary"

Let's now create an array of all the persons involved:

let personArray = [john, jimmy, mary]

And also let's add some family ties:

john.kids = [jimmy, mary]

Now, our task is to create a function to find out the parent of a person, if any. And here is the code:

func findParent(person: Person) -> Person? {
    for aPerson in personArray {
        for kid in aPerson.kids {
            if kid === person {
                return aPerson
            }
        }
    }
    return nil
}

And here is the code testing it:

if let parent = findParent(mary) {
    print(parent.description())
}

All is good in this case. We have only three people so the result is instantaneous. 

In real life we can have thousands of records maybe millions. 

Imagine a database with one million people and let's compute the time it takes to find out the list of people with parents.

For each person we have to check all the other one million people (of course, we need to check 999 999, but let's ignore that small detail).

Also, let's assume we have a 2.5 Ghz processor and that it takes ten cycles to check if two persons are the same. Of course this is not true, but just for the sake of calculation let's assume that.

The time it will take will be:

1 000 000 persons x 1 000 000 persons  x 10 cycles / 2 500 000 000 cycles/sec = 4000 sec = over 1h

This is not acceptable.

And here comes our refactoring trick.

Instead of a unidirectional association we can introduce a bidirectional one, just by adding another property:

class Person1 {
    var name: String?
    var kids = [Person] ()
    var parent: Person?
}

In this case, finding the parents of all the persons will take:

1 000 000 persons  x 10 cycles / 2 500 000 000 cycles/sec = 1 / 250 sec = 0.004 sec

We would need to change the code that initialize the database by making sure each person object has the parent property set, if we have this information.

class Person {
    var name: String?
    var kids = [Person] ()
    var parent: Person?

    func description() -> String {
        return "Person: name = \(name)"

    }
}

let john = Person()
john.name = "John"

let jimmy = Person()
jimmy.name = "Jimmy"
jimmy.parent = john

let mary = Person()
mary.name = "Mary"
mary.parent = john

john.kids = [jimmy,mary]

Of course the function findParent does not make sense. We just need to access mary.parent to find out her parent.

No comments:

Post a Comment