Function Hijacking Mitigation




by Walter Bright
Digital Mars
http://www.digitalmars.com

Global Function Hijacking

Consider an application that imports two modules, X and Y.

X and Y are unrelated to each other, and are used for completely different purposes.









module X;

void foo();
void foo(long);

module Y;

void bar();

The application program would look like:

import X;
import Y;

void abc()
{
  foo(1); // calls X.foo(long)
}

void def()
{
  bar();  // calls Y.bar();
}








  1. so far, so good
  2. application is tested and works
  3. application is shipped
  4. time goes by
  5. application programmer moves on
  6. application is put in maintenance mode








and then...









module Y;

void bar();
class A;
void foo(A);








  1. application maintainer gets the latest version of Y
  2. recompiles
  3. no problems
  4. but then...








YYY Corporation expands the functionality of foo(A), adding a function foo(int):

module Y;

void bar();
class A;
void foo(A);
void foo(int);

Suddenly something unexpected happens to our application:

import X;
import Y;

void abc()
{
  foo(1); // calls Y.foo(int)
          // not X.foo(long)
}

void def()
{
  bar();
}

The problem is, this is how overloading is supposed to work!









Mitigation?

The module developer can mitigate by:

But that's no guarantee, and there's nothing the user can do about it.









Fixing the Language

The first stab:

  1. by default functions can only overload against other functions in the same module
  2. if a name is found in more than one scope, in order to use it it must be fully qualified
  3. in order to overload functions from multiple modules together, an alias statement is used to merge the overloads

The application maintainer now gets a compilation error: foo is defined in both module X and module Y.

















Overload Sets

An overload set is formed by a group of functions with the same name declared in the same scope.

  1. X.foo() and X.foo(long) form one overload set
  2. Y.foo(A) and Y.foo(int) form another overload set

Our method for resolving a call to foo becomes:

  1. Perform overload resolution independently on each overload set
  2. If there is no match in any overload set, then error
  3. If there is a match in exactly one overload set, then go with that
  4. If there is a match in more than one overload set, then error








Most Importantly

Even if there is a BETTER match in one overload set than in another, it is still an error. The overload sets must not overlap.









void abc()
{
 foo(1); // matches Y.foo(int)
         // matches X.foo(long)
         // error!
 A a;
 foo(a); // matches Y.foo(A)
         // no match in X
 foo();  // matches X.foo()
         // no match in Y
}
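
The resolution steps above can be sketched in a single file. This is only an illustration under stated assumptions: the structs X and Y are stand-ins for the two modules (each struct scope holds one overload set), and resolve is a hypothetical helper that applies steps 2 to 4 to the per-set results. The compiler does this work internally; the sketch just makes the rule visible.

```d
import std.stdio : writeln;

class A { }

// Stand-in for module X: one overload set.
struct X
{
    static string foo()       { return "X.foo()"; }
    static string foo(long v) { return "X.foo(long)"; }
}

// Stand-in for module Y: another overload set.
struct Y
{
    static string foo(A a)   { return "Y.foo(A)"; }
    static string foo(int v) { return "Y.foo(int)"; }
}

// Hypothetical helper: steps 2 to 4, given the per-set results of step 1.
// Which set has the better match is deliberately never considered.
string resolve(bool matchesX, bool matchesY)
{
    if (matchesX && matchesY)
        return "error: match in more than one overload set";
    if (matchesX) return "X";
    if (matchesY) return "Y";
    return "error: no match in any overload set";
}

void main()
{
    // Step 1: try overload resolution on each set independently.
    writeln("foo(1): ",
        resolve(__traits(compiles, X.foo(1)), __traits(compiles, Y.foo(1))));
    writeln("foo(a): ",
        resolve(__traits(compiles, X.foo(new A)), __traits(compiles, Y.foo(new A))));
    writeln("foo():  ",
        resolve(__traits(compiles, X.foo()), __traits(compiles, Y.foo())));
}
```

Here foo(1) is rejected even though Y.foo(int) is an exact match, because X.foo(long) also accepts the argument.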








To overload foo between X and Y, merge the two overload sets with alias declarations:

import X;
import Y;

alias X.foo foo;
alias Y.foo foo;

void abc()
{
 foo(1); // calls Y.foo(int)
         // not X.foo(long)
}

Hijacking can happen here, but only because the user deliberately conflated the overload sets.









Derived Class Member Function Hijacking

Imagine a class A coming from AAA Corporation:

module M;

class A { }

Application code derives from A and adds virtual member function foo:

import M;

class B : A
{
 void foo(long);
}

void abc(B b)
{
 b.foo(1); //calls B.foo(long)
}

AAA Corporation (who cannot know about B) extends A's functionality by adding foo(int):

module M;

class A
{
  void foo(int);
}

Assume Java-style overloading rules: base class member functions overload right alongside derived class functions.

import M;

class B : A
{
  void foo(long);
}

void abc(B b)
{
 b.foo(1); //calls A.foo(int)
}

A.foo(int) hijacked call to B.foo(long).









Mitigation

In C++, functions in a derived class hide all the functions of the same name in a base class.

Even if the functions in the base class might be a better match.

Overloading can still be done with a using declaration.

D follows the same method, with an alias declaration taking the place of the using declaration.
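
A minimal sketch of the opt-in in D, assuming the classes A and B from the earlier example with function bodies added for illustration (the string return values are not in the original). The alias explicitly brings A's foo overloads into B's scope so that they overload together:

```d
class A
{
    string foo(int x) { return "A.foo(int)"; }
}

class B : A
{
    // Deliberately merge A's foo overloads with B's.
    alias foo = A.foo;

    string foo(long x) { return "B.foo(long)"; }
}

void main()
{
    auto b = new B;
    assert(b.foo(1)  == "A.foo(int)");  // exact match on int wins
    assert(b.foo(1L) == "B.foo(long)"); // only foo(long) accepts a long
}
```

With the alias in place, b.foo(1) selects A.foo(int), but only because the author of B asked for the base class overloads.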









Base Class Member Function Hijacking

Hijacking can go the other way, too.

A derived class can hijack a base class member function!

module M;

class A
{
    void def() { }
}

application code derives from A, adds virtual member function foo:

import M;

class B : A
{
  void foo();
}

void abc(B b)
{
  b.def(); // calls A.def()
}

AAA Corporation once again knows nothing about B.

AAA adds function foo() and uses it to implement some new functionality of A:

module M;

class A
{
  void foo();

  void def()
  {
    foo(); // expects A.foo()
           // but gets B.foo()
  }
}

B.foo() has hijacked A.foo()!

Shouldn't A.foo() be non-virtual?

No way to safely add functionality to A.









Solution: Qualify with override

To override function in a base class, use the storage class override.

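Error if:

  1. a function overrides a base class function but is not marked override
  2. a function is marked override but there is no base class function to override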

class C
{
  void foo();
  void bar();
}
class D : C
{
  override void foo(); // ok
  void bar();          // error: overrides C.bar() without override
  override void abc(); // error: no C.abc() to override
}

eliminates this form of hijacking
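
A runnable sketch of the intended case, reusing the names A, B, foo and def from the earlier scenario. The function bodies and string return values are assumptions added so the dispatch can be observed. Because B marks foo with override, the call inside A.def() reaching B.foo() is a deliberate choice by B's author, not an accident:

```d
class A
{
    string foo() { return "A.foo()"; }

    // def() calls foo() virtually.
    string def() { return foo(); }
}

class B : A
{
    // Explicitly stated intent to replace A.foo().
    override string foo() { return "B.foo()"; }
}

void main()
{
    A viaDerived = new B;
    assert(viaDerived.def() == "B.foo()");

    auto plain = new A;
    assert(plain.def() == "A.foo()");
}
```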









Derived Class Member Function Hijacking #2

module A;

class A
{
  void def()
  {
    foo(1);
  }

  void foo(long);
}

foo(long) is a virtual function that provides a specific functionality.

Our user overrides foo(long):

import A;

class B : A
{
  override void foo(long);
}

void abc(B b)
{
  b.def(); // eventually calls
           // B.foo(long)
}

call to foo(1) inside A winds up correctly calling B.foo(long).

A's designer decides to optimize things, and adds an overload for foo:

module A;

class A
{
  void def()
  {
    foo(1);
  }

  void foo(long);
  void foo(int);
}

Again, our user class:

import A;

class B : A
{
  override void foo(long);
}

void abc(B b)
{
  b.def(); //eventually calls
           //A.foo(int)
}

B is no longer overriding A's foo!

It's been hijacked by the base class.

B needs to add another function:

class B : A
{
    override void foo(long);
    override void foo(int);
}

But there's no indication this must be done.









A's vtbl[] looks like:

A.vtbl[0] = &A.foo(long);
A.vtbl[1] = &A.foo(int);

B's vtbl[] looks like:

B.vtbl[0] = &B.foo(long);
B.vtbl[1] = &A.foo(int);

call in A.def() to foo(int) is actually a call to vtbl[1].

We'd really like A.foo(int) to be inaccessible from a B object.









Solution: Fix the vtbl[]

The solution is to rewrite B's vtbl[] as:

B.vtbl[0] = &B.foo(long);
B.vtbl[1] = &error;

Calling vtbl[1] now means error() is called instead, which throws an exception.

The problem is not caught at compile time, but at least it is caught.
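
The effect of the rewritten table can be modeled with ordinary function pointers. This is a rough model only: the arrays stand in for the compiler-generated vtbl[], and every name here (Slot, aFooInt, hiddenError, vtblOfA, vtblOfB) is hypothetical. It is not how a D compiler lays out or populates real vtables.

```d
import std.exception : assertThrown;

alias Slot = string function();

string aFooLong() { return "A.foo(long)"; }
string aFooInt()  { return "A.foo(int)"; }
string bFooLong() { return "B.foo(long)"; }

// Stand-in for the error() entry: always throws.
string hiddenError()
{
    throw new Exception("hidden base class function called");
}

// Model of A's table: slot 0 is foo(long), slot 1 is foo(int).
Slot[] vtblOfA() { return [&aFooLong, &aFooInt]; }

// Model of B's rewritten table: the slot B did not override is poisoned.
Slot[] vtblOfB() { return [&bFooLong, &hiddenError]; }

// Model of A.def(): foo(1) resolves to foo(int), which is slot 1.
string def(Slot[] vtbl) { return vtbl[1](); }

void main()
{
    assert(def(vtblOfA()) == "A.foo(int)");
    assertThrown!Exception(def(vtblOfB())); // caught at run time
}
```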









Conclusion









Discussion