Contracts

Contracts in Solidity are what classes are in object-oriented languages. They contain persistent data in state variables and functions that can modify these variables. Calling a function on a different contract (instance) performs an EVM function call and thus switches the context, so that the state variables of the calling contract are inaccessible.

Creating Contracts

Contracts can be created “from outside” or from Solidity contracts. When a contract is created, its constructor (a function with the same name as the contract) is executed once.

From web3.js, i.e. the JavaScript API, this is done as follows:

// The json abi array generated by the compiler
var abiArray = [
  {
    "inputs":[
      {"name":"x","type":"uint256"},
      {"name":"y","type":"uint256"}
    ],
    "type":"constructor"
  },
  {
    "constant":true,
    "inputs":[],
    "name":"x",
    "outputs":[{"name":"","type":"bytes32"}],
    "type":"function"
  }
];

var MyContract = web3.eth.contract(abiArray);
// deploy new contract
var contractInstance = MyContract.new(
  10, // x
  11, // y
  {from: myAccount, gas: 1000000}
);

Internally, constructor arguments are passed after the code of the contract itself, but you do not have to care about this if you use web3.js.
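As a rough illustration of what happens behind the scenes, the sketch below appends encoded constructor arguments to the compiled code. It assumes that all arguments are non-negative integers of type uint256, each encoded as a 32-byte big-endian word. The names encodeUint256 and buildCreationData are hypothetical helpers (not part of web3.js), and the bytecode string is a made-up stand-in for real compiler output.

```javascript
// Encode a non-negative integer as a 32-byte (64 hex digit) big-endian word.
function encodeUint256(value) {
  return BigInt(value).toString(16).padStart(64, "0");
}

// Creation data = compiled code followed by the encoded constructor arguments.
function buildCreationData(bytecodeHex, args) {
  var data = bytecodeHex.replace(/^0x/, "");
  for (var i = 0; i < args.length; i++) {
    data += encodeUint256(args[i]);
  }
  return "0x" + data;
}

// "0x6060" stands in for real compiler output (hypothetical value).
var creationData = buildCreationData("0x6060", [10, 11]);
console.log(creationData);
```

The result is the code followed by two 32-byte words, one per constructor argument, which is why the contract code itself does not need to change when different arguments are supplied.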

If a contract wants to create another contract, the source code (and the binary) of the created contract has to be known to the creator. This means that cyclic creation dependencies are impossible.

contract OwnedToken {
  // TokenCreator is a contract type that is defined below.
  // It is fine to reference it as long as it is not used
  // to create a new contract.
  TokenCreator creator;
  address owner;
  bytes32 name;
  // This is the constructor which registers the
  // creator and the assigned name.
  function OwnedToken(bytes32 _name) {
    owner = msg.sender;
    // We do an explicit type conversion from `address`
    // to `TokenCreator` and assume that the type of
    // the calling contract is TokenCreator, there is
    // no real way to check that.
    creator = TokenCreator(msg.sender);
    name = _name;
  }
  function changeName(bytes32 newName) {
    // Only the creator can alter the name --
    // the comparison is possible since contracts
    // are implicitly convertible to addresses.
    if (msg.sender == creator) name = newName;
  }
  function transfer(address newOwner) {
    // Only the current owner can transfer the token.
    if (msg.sender != owner) return;
    // We also want to ask the creator if the transfer
    // is fine. Note that this calls a function of the
    // contract defined below. If the call fails (e.g.
    // due to out-of-gas), the execution here stops
    // immediately.
    if (creator.isTokenTransferOK(owner, newOwner))
      owner = newOwner;
  }
}

contract TokenCreator {
  function createToken(bytes32 name)
       returns (OwnedToken tokenAddress)
  {
    // Create a new Token contract and return its address.
    // From the JavaScript side, the return type is simply
    // "address", as this is the closest type available in
    // the ABI.
    return new OwnedToken(name);
  }
  function changeName(OwnedToken tokenAddress, bytes32 name) {
    // Again, the external type of "tokenAddress" is
    // simply "address".
    tokenAddress.changeName(name);
  }
  function isTokenTransferOK(
      address currentOwner,
      address newOwner
  ) returns (bool ok) {
    // Check some arbitrary condition.
    address tokenAddress = msg.sender;
    return (sha3(newOwner) & 0xff) == (bytes20(tokenAddress) & 0xff);
  }
}

Visibility and Accessors

Solidity knows two kinds of function calls: internal ones, which do not create an actual EVM call (also called a “message call”), and external ones, which do. Because of that, there are four types of visibility for functions and state variables.

Functions can be specified as being external, public, internal or private, where the default is public. For state variables, external is not possible and the default is internal.

external:
External functions are part of the contract interface, which means they can be called from other contracts and via transactions. An external function f cannot be called internally (i.e. f() does not work, but this.f() works). External functions are sometimes more efficient when they receive large arrays of data.
public:
Public functions are part of the contract interface and can be either called internally or via messages. For public state variables, an automatic accessor function (see below) is generated.
internal:
Those functions and state variables can only be accessed internally (i.e. from within the current contract or contracts deriving from it), without using this.
private:
Private functions and state variables are only visible for the contract they are defined in and not in derived contracts.

The visibility specifier is given after the type for state variables and between parameter list and return parameter list for functions.

contract c {
  function f(uint a) private returns (uint b) { return a + 1; }
  function setData(uint a) internal { data = a; }
  uint public data;
}

Other contracts can call c.data() to retrieve the value of data in state storage, but are not able to call f. Contracts derived from c can call setData to alter the value of data (but only in their own state).

Accessor Functions

The compiler automatically creates accessor functions for all public state variables. The contract given below will have a function called data that does not take any arguments and returns a uint, the value of the state variable data. The initialization of state variables can be done at declaration.

The accessor functions have external visibility. If the symbol is accessed internally (i.e. without this.), it is a state variable and if it is accessed externally (i.e. with this.), it is a function.

contract test {
     uint public data = 42;
}

The next example is a bit more complex:

contract complex {
  struct Data { uint a; bytes3 b; mapping(uint => uint) map; }
  mapping(uint => mapping(bool => Data[])) public data;
}

It will generate a function of the following form:

function data(uint arg1, bool arg2, uint arg3) returns (uint a, bytes3 b)
{
  a = data[arg1][arg2][arg3].a;
  b = data[arg1][arg2][arg3].b;
}

Note that the mapping in the struct is omitted because there is no good way to provide the key for the mapping.

Function Modifiers

Modifiers can be used to easily change the behaviour of functions, for example to automatically check a condition prior to executing the function. They are inheritable properties of contracts and may be overridden by derived contracts.

contract owned {
  function owned() { owner = msg.sender; }
  address owner;

  // This contract only defines a modifier but does not use
  // it - it will be used in derived contracts.
  // The function body is inserted where the special symbol
  // "_" in the definition of a modifier appears.
  // This means that if the owner calls this function, the
  // function is executed and otherwise, an exception is
  // thrown.
  modifier onlyowner { if (msg.sender != owner) throw; _ }
}
contract mortal is owned {
  // This contract inherits the "onlyowner"-modifier from
  // "owned" and applies it to the "close"-function, so
  // that calls to "close" only have an effect if they
  // are made by the stored owner.
  function close() onlyowner {
    selfdestruct(owner);
  }
}
contract priced {
  // Modifiers can receive arguments:
  modifier costs(uint price) { if (msg.value >= price) _ }
}
contract Register is priced, owned {
  mapping (address => bool) registeredAddresses;
  uint price;
  function Register(uint initialPrice) { price = initialPrice; }
  function register() costs(price) {
    registeredAddresses[msg.sender] = true;
  }
  function changePrice(uint _price) onlyowner {
    price = _price;
  }
}

Multiple modifiers can be applied to a function by specifying them in a whitespace-separated list and will be evaluated in order. Explicit returns from a modifier or function body immediately leave the whole function, while control flow reaching the end of a function or modifier body continues after the “_” in the preceding modifier. Arbitrary expressions are allowed for modifier arguments and in this context, all symbols visible from the function are visible in the modifier. Symbols introduced in the modifier are not visible in the function (as they might change by overriding).

Constants

State variables can be declared as constant (this is not yet implemented for array and struct types and not possible for mapping types).

contract C {
  uint constant x = 32**22 + 8;
  string constant text = "abc";
}

This has the effect that the compiler does not reserve a storage slot for these variables and every occurrence is replaced by their constant value.

The value expression can only contain integer arithmetic.

Fallback Function

A contract can have exactly one unnamed function. This function cannot have arguments and is executed on a call to the contract if none of the other functions matches the given function identifier (or if no data was supplied at all).

Furthermore, this function is executed whenever the contract receives plain Ether (without data). In such a context, there is very little gas available to the function call, so it is important to make fallback functions as cheap as possible.

contract Test {
  function() { x = 1; }
  uint x;
}

// This contract rejects any Ether sent to it. It is good
// practice to include such a function for every contract
// in order not to lose Ether.
contract Rejector {
  function() { throw; }
}

contract Caller {
  function callTest(address testAddress) {
    Test(testAddress).call(0xabcdef01); // hash does not exist
    // results in Test(testAddress).x becoming == 1.
    Rejector r = Rejector(0x123);
    r.send(2 ether);
    // results in r.balance == 0
  }
}
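
The dispatch rule described in this section can be modelled outside the EVM. The following sketch is only an illustration: it assumes that the function identifier is the first four bytes of the call data, and dispatch, the function table and the identifier 12345678 are hypothetical.

```javascript
// Model of a contract's entry point: run the function whose identifier
// matches the first four bytes of the call data, otherwise the fallback.
function dispatch(functions, fallback, callDataHex) {
  var data = callDataHex.replace(/^0x/, "");
  var id = data.slice(0, 8); // four bytes = eight hex digits
  if (data.length >= 8 && Object.prototype.hasOwnProperty.call(functions, id)) {
    return functions[id]();
  }
  return fallback();
}

var model = {
  functions: { "12345678": function () { return "named function"; } },
  fallback: function () { return "fallback"; }
};

console.log(dispatch(model.functions, model.fallback, "0x12345678")); // named function
console.log(dispatch(model.functions, model.fallback, "0xabcdef01")); // fallback
console.log(dispatch(model.functions, model.fallback, "0x"));         // fallback
```

The second call corresponds to the unknown identifier used in callTest above, and the third to a plain Ether transfer without data.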

Events

Events allow the convenient usage of the EVM logging facilities, which in turn can be used to “call” JavaScript callbacks in the user interface of a dapp, which listen for these events.

Events are inheritable members of contracts. When they are called, they cause the arguments to be stored in the transaction’s log - a special data structure in the blockchain. These logs are associated with the address of the contract and will be incorporated into the blockchain and stay there as long as a block is accessible (forever as of Frontier and Homestead, but this might change with Serenity). Log and event data is not accessible from within contracts (not even from the contract that created a log).

SPV proofs for logs are possible, so if an external entity supplies a contract with such a proof, it can check that the log actually exists inside the blockchain (but be aware of the fact that ultimately, also the block headers have to be supplied because the contract can only see the last 256 block hashes).

Up to three parameters can receive the attribute indexed, which makes the respective arguments searchable: it is possible to filter for specific values of indexed arguments in the user interface.

If arrays (including string and bytes) are used as indexed arguments, the sha3 hash of the argument is stored as the topic instead.

The hash of the signature of the event is one of the topics, except if you declared the event with the anonymous specifier. This means that it is not possible to filter for specific anonymous events by name.

All non-indexed arguments will be stored in the data part of the log.

contract ClientReceipt {
  event Deposit(
    address indexed _from,
    bytes32 indexed _id,
    uint _value
  );
  function deposit(bytes32 _id) {
    // Any call to this function (even deeply nested) can
    // be detected from the JavaScript API by filtering
    // for `Deposit` to be called.
    Deposit(msg.sender, _id, msg.value);
  }
}

The use in the JavaScript API would be as follows:

var abi = /* abi as generated by the compiler */;
var ClientReceipt = web3.eth.contract(abi);
var clientReceipt = ClientReceipt.at(0x123 /* address */);

var event = clientReceipt.Deposit();

// watch for changes
event.watch(function(error, result){
  // result will contain various information
  // including the arguments given to the Deposit
  // call.
  if (!error)
    console.log(result);
});

// Or pass a callback to start watching immediately
var event = clientReceipt.Deposit(function(error, result) {
  if (!error)
    console.log(result);
});

Low-Level Interface to Logs

It is also possible to access the low-level interface to the logging mechanism via the functions log0, log1, log2, log3 and log4. logi takes i + 1 parameters of type bytes32, where the first argument will be used for the data part of the log and the others as topics. The event call above can be performed in the same way as

log3(
  msg.value,
  0x50cb9fe53daa9737b786ab3646f04d0150dc50ef4e75f59509d83667ad5adb20,
  msg.sender,
  _id
);

where the long hexadecimal number is equal to sha3(“Deposit(address,hash256,uint256)”), the signature of the event.
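
To make the split between topics and data concrete, the sketch below builds the log entry that the Deposit event would produce. It is an illustration only: pad32 and depositLog are hypothetical helpers, the sample address, id and value are made up, and the signature hash is the one quoted above. It assumes that an address is left-padded with zeros to 32 bytes and that the id is already a full 32-byte value.

```javascript
// Left-pad a hex string to a 32-byte (64 hex digit) word.
function pad32(hex) {
  return hex.replace(/^0x/, "").padStart(64, "0");
}

// Layout for Deposit(address indexed _from, bytes32 indexed _id, uint _value).
function depositLog(from, id, value) {
  return {
    topics: [
      // hash of the event signature (absent for anonymous events)
      "0x50cb9fe53daa9737b786ab3646f04d0150dc50ef4e75f59509d83667ad5adb20",
      "0x" + pad32(from),              // indexed _from
      "0x" + id.replace(/^0x/, "")     // indexed _id, already 32 bytes
    ],
    // non-indexed _value goes into the data part
    data: "0x" + pad32(BigInt(value).toString(16))
  };
}

var sampleFrom = "0x" + "11".repeat(20); // made-up address
var sampleId = "0x" + "ab".repeat(32);   // made-up id
var log = depositLog(sampleFrom, sampleId, 5);
console.log(log.topics.length, log.data);
```

The topic order matches the argument order of the log3 call above: the data argument first, then the signature hash, then the indexed arguments.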

Inheritance

Solidity supports multiple inheritance, including polymorphism, by copying code.

All function calls are virtual, which means that the most derived function is called, except when the contract is explicitly given.

Even if a contract inherits from multiple other contracts, only a single contract is created on the blockchain; the code from the base contracts is always copied into the final contract.

The general inheritance system is very similar to Python’s, especially concerning multiple inheritance.

Details are given in the following example.

contract owned {
    function owned() { owner = msg.sender; }
    address owner;
}

// Use "is" to derive from another contract. Derived
// contracts can access all non-private members including
// internal functions and state variables. These cannot be
// accessed externally via `this`, though.
contract mortal is owned {
    function kill() {
      if (msg.sender == owner) selfdestruct(owner);
    }
}

// These abstract contracts are only provided to make the
// interface known to the compiler. Note the function
// without body. If a contract does not implement all
// functions it can only be used as an interface.
contract Config {
    function lookup(uint id) returns (address adr);
}
contract NameReg {
    function register(bytes32 name);
    function unregister();
}

// Multiple inheritance is possible. Note that "owned" is
// also a base class of "mortal", yet there is only a single
// instance of "owned" (as for virtual inheritance in C++).
contract named is owned, mortal {
    function named(bytes32 name) {
        Config config = Config(0xd5f9d8d94886e70b06e474c3fb14fd43e2f23970);
        NameReg(config.lookup(1)).register(name);
    }

    // Functions can be overridden, both local and
    // message-based function calls take these overrides
    // into account.
    function kill() {
        if (msg.sender == owner) {
            Config config = Config(0xd5f9d8d94886e70b06e474c3fb14fd43e2f23970);
            NameReg(config.lookup(1)).unregister();
            // It is still possible to call a specific
            // overridden function.
            mortal.kill();
        }
    }
}

// If a constructor takes an argument, it needs to be
// provided in the header (or modifier-invocation-style at
// the constructor of the derived contract (see below)).
contract PriceFeed is owned, mortal, named("GoldFeed") {
   function updateInfo(uint newInfo) {
      if (msg.sender == owner) info = newInfo;
   }

   function get() constant returns(uint r) { return info; }

   uint info;
}

Note that above, we call mortal.kill() to “forward” the destruction request. The way this is done is problematic, as seen in the following example:

contract mortal is owned {
    function kill() {
        if (msg.sender == owner) selfdestruct(owner);
    }
}
contract Base1 is mortal {
    function kill() { /* do cleanup 1 */ mortal.kill(); }
}
contract Base2 is mortal {
    function kill() { /* do cleanup 2 */ mortal.kill(); }
}
contract Final is Base1, Base2 {
}

A call to Final.kill() will call Base2.kill as the most derived override, but this function will bypass Base1.kill, basically because it does not even know about Base1. The way around this is to use super:

contract mortal is owned {
    function kill() {
        if (msg.sender == owner) selfdestruct(owner);
    }
}
contract Base1 is mortal {
    function kill() { /* do cleanup 1 */ super.kill(); }
}
contract Base2 is mortal {
    function kill() { /* do cleanup 2 */ super.kill(); }
}
contract Final is Base2, Base1 {
}

If Base1 calls a function of super, it does not simply call this function on one of its base contracts, it rather calls this function on the next base contract in the final inheritance graph, so it will call Base2.kill() (note that the final inheritance sequence is – starting with the most derived contract: Final, Base1, Base2, mortal, owned). The actual function that is called when using super is not known in the context of the class where it is used, although its type is known. This is similar for ordinary virtual method lookup.

Arguments for Base Constructors

Derived contracts need to provide all arguments needed for the base constructors. This can be done at two places:

contract Base {
  uint x;
  function Base(uint _x) { x = _x; }
}
contract Derived is Base(7) {
  function Derived(uint _y) Base(_y * _y) {
  }
}

Either directly in the inheritance list (is Base(7)) or in the way a modifier would be invoked as part of the header of the derived constructor (Base(_y * _y)). The first way to do it is more convenient if the constructor argument is a constant and defines the behaviour of the contract or describes it. The second way has to be used if the constructor arguments of the base depend on those of the derived contract. If, as in this silly example, both places are used, the modifier-style argument takes precedence.

Multiple Inheritance and Linearization

Languages that allow multiple inheritance have to deal with several problems, one of them being the Diamond Problem. Solidity follows the path of Python and uses “C3 Linearization” to force a specific order in the DAG of base classes. This results in the desirable property of monotonicity but disallows some inheritance graphs. In particular, the order in which the base classes are given in the is directive is important. In the following code, Solidity will give the error “Linearization of inheritance graph impossible”.

contract X {}
contract A is X {}
contract C is A, X {}

The reason for this is that C requests X to override A (by specifying A, X in this order), but A itself requests to override X, which is a contradiction that cannot be resolved.

A simple rule to remember is to specify the base classes in the order from “most base-like” to “most derived”.
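
The following sketch reimplements C3 linearization to show why the ordering A, X fails while X, A succeeds. It is a simplified model rather than the compiler's code: linearize and merge are hypothetical names, and the base list is reversed before merging on the assumption that the last base listed after is counts as the most derived one.

```javascript
// bases maps each contract to its bases in the order given after "is".
function linearize(name, bases) {
  // The last base listed is the most derived, so process in reverse.
  var parents = (bases[name] || []).slice().reverse();
  var sequences = parents.map(function (p) { return linearize(p, bases); });
  sequences.push(parents);
  return [name].concat(merge(sequences));
}

// Repeatedly take a head that appears in no sequence's tail.
function merge(sequences) {
  var result = [];
  var seqs = sequences.filter(function (s) { return s.length > 0; });
  while (seqs.length > 0) {
    var head = null;
    for (var i = 0; i < seqs.length && head === null; i++) {
      var candidate = seqs[i][0];
      var blocked = seqs.some(function (s) {
        return s.indexOf(candidate) > 0;
      });
      if (!blocked) head = candidate;
    }
    if (head === null) {
      throw new Error("Linearization of inheritance graph impossible");
    }
    result.push(head);
    seqs = seqs.map(function (s) {
      return s.filter(function (c) { return c !== head; });
    }).filter(function (s) { return s.length > 0; });
  }
  return result;
}

// "contract C is X, A {}" can be linearized ...
console.log(linearize("C", { X: [], A: ["X"], C: ["X", "A"] }).join(", ")); // C, A, X
// ... whereas { X: [], A: ["X"], C: ["A", "X"] } makes linearize throw.
```

Under the same assumption, the model yields Final, Base1, Base2, mortal, owned for the contract Final is Base2, Base1 from the super example.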

Abstract Contracts

Contract functions can lack an implementation as in the following example (note that the function declaration header is terminated by ;):

contract feline {
  function utterance() returns (bytes32);
}

Such contracts cannot be compiled (even if they contain implemented functions alongside non-implemented functions), but they can be used as base contracts:

contract Cat is feline {
  function utterance() returns (bytes32) { return "miaow"; }
}

If a contract inherits from an abstract contract and does not implement all non-implemented functions by overriding, it will itself be abstract.

Libraries

Libraries are similar to contracts, but their purpose is that they are deployed only once at a specific address and their code is reused using the CALLCODE feature of the EVM. This means that if library functions are called, their code is executed in the context of the calling contract, i.e. this points to the calling contract, and in particular the storage of the calling contract can be accessed. As a library is an isolated piece of source code, it can only access state variables of the calling contract if they are explicitly supplied (it would have no way to name them otherwise).

The following example illustrates how to use libraries (but be sure to check out the Using For section for a more advanced way to implement a set).

library Set {
  // We define a new struct datatype that will be used to
  // hold its data in the calling contract.
  struct Data { mapping(uint => bool) flags; }
  // Note that the first parameter is of type "storage
  // reference" and thus only its storage address and not
  // its contents is passed as part of the call.  This is a
  // special feature of library functions.  It is idiomatic
  // to call the first parameter 'self', if the function can
  // be seen as a method of that object.
  function insert(Data storage self, uint value)
      returns (bool)
  {
    if (self.flags[value])
      return false; // already there
    self.flags[value] = true;
    return true;
  }
  function remove(Data storage self, uint value)
    returns (bool)
  {
    if (!self.flags[value])
      return false; // not there
    self.flags[value] = false;
    return true;
  }
  function contains(Data storage self, uint value)
    returns (bool)
  {
    return self.flags[value];
  }
}
contract C {
  Set.Data knownValues;
  function register(uint value) {
    // The library functions can be called without a
    // specific instance of the library, since the
    // "instance" will be the current contract.
    if (!Set.insert(knownValues, value))
      throw;
  }
  // In this contract, we can also directly access knownValues.flags, if we want.
}

Of course, you do not have to use libraries this way: they can also be used without defining struct data types. Functions also work without any storage reference parameters, and they can have multiple storage reference parameters in any position.

The calls to Set.contains, Set.insert and Set.remove are all compiled as calls (CALLCODEs) to an external contract/library. If you use libraries, take care that an actual external function call is performed, so msg.sender does not point to the original sender anymore but to the calling contract, and msg.value contains the funds sent during the call to the library function.

As the compiler cannot know where the library will be deployed at, these addresses have to be filled into the final bytecode by a linker (see [Using the Commandline Compiler](#using-the-commandline-compiler) on how to use the commandline compiler for linking). If the addresses are not given as arguments to the compiler, the compiled hex code will contain placeholders of the form __Set______ (where Set is the name of the library). The address can be filled manually by replacing all those 40 symbols by the hex encoding of the address of the library contract.
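
Manual linking amounts to a string replacement on the hex code. The sketch below assumes that a placeholder consists of two underscores, the library name, and trailing underscores up to 40 characters in total. The helper linkLibrary is hypothetical, and the bytecode fragments and the address are made up.

```javascript
// Replace every placeholder for libraryName with the library's address.
function linkLibrary(bytecodeHex, libraryName, address) {
  var addr = address.replace(/^0x/, "").toLowerCase();
  if (!/^[0-9a-f]{40}$/.test(addr)) {
    throw new Error("address must be 20 bytes of hex");
  }
  // Assumed placeholder layout: "__" + name, padded with "_" to 40 symbols.
  var placeholder = ("__" + libraryName).padEnd(40, "_");
  return bytecodeHex.split(placeholder).join(addr);
}

// Made-up unlinked code containing one placeholder for the library Set.
var unlinked = "6060" + "__Set".padEnd(40, "_") + "5050";
var linked = linkLibrary(unlinked, "Set", "0x" + "12".repeat(20));
console.log(linked);
```

Because an address is 20 bytes, its hex encoding has exactly the same length as the placeholder, so the length of the code does not change.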

Restrictions for libraries in comparison to contracts:

  • no state variables
  • cannot inherit nor be inherited

(these might be lifted at a later point)

Common pitfalls for libraries

The value of msg.sender

The value for msg.sender will be that of the contract which is calling the library function.

For example, if A calls contract B which internally calls library C, then within the function call of library C, msg.sender will be the address of contract B.

The reason for this is that the expression LibraryName.functionName() performs an external function call using CALLCODE, which maps to a real EVM call just like otherContract.functionName() or this.functionName(). This call extends the call depth by one (limited to 1024), stores the caller (the current contract) as msg.sender, and then executes the library contract’s code against the current contract’s storage. This execution occurs in a completely new memory context, meaning that memory types will be copied and cannot be passed by reference.

Transferring Ether

It is in principle possible to transfer ether using LibraryName.functionName.value(x)(), but as CALLCODE is used, the Ether will just end up at the current contract.

Using For

The directive using A for B; can be used to attach library functions (from the library A) to any type (B). These functions will receive the object they are called on as their first parameter (like the self variable in Python).

The effect of using A for *; is that the functions from the library A are attached to any type.

In both situations, all functions, even those where the type of the first parameter does not match the type of the object, are attached. The type is checked at the point the function is called and function overload resolution is performed.

The using A for B; directive is active for the current scope, which is limited to a contract for now but will be lifted to the global scope later, so that by including a module, its data types including library functions are available without having to add further code.

Let us rewrite the set example from the Libraries section in this way:

// This is the same code as before, just without comments
library Set {
  struct Data { mapping(uint => bool) flags; }
  function insert(Data storage self, uint value)
      returns (bool)
  {
    if (self.flags[value])
      return false; // already there
    self.flags[value] = true;
    return true;
  }
  function remove(Data storage self, uint value)
    returns (bool)
  {
    if (!self.flags[value])
      return false; // not there
    self.flags[value] = false;
    return true;
  }
  function contains(Data storage self, uint value)
    returns (bool)
  {
    return self.flags[value];
  }
}

contract C {
  using Set for Set.Data; // this is the crucial change
  Set.Data knownValues;
  function register(uint value) {
    // Here, all variables of type Set.Data have
    // corresponding member functions.
    // The following function call is identical to
    // Set.insert(knownValues, value)
    if (!knownValues.insert(value))
      throw;
  }
}

It is also possible to extend elementary types in that way:

library Search {
  function indexOf(uint[] storage self, uint value) returns (uint) {
    for (uint i = 0; i < self.length; i++)
      if (self[i] == value) return i;
    return uint(-1);
  }
}

contract C {
  using Search for uint[];
  uint[] data;
  function append(uint value) {
    data.push(value);
  }
  function replace(uint _old, uint _new) {
    // This performs the library function call
    uint index = data.indexOf(_old);
    if (index == uint(-1))
      data.push(_new);
    else
      data[index] = _new;
  }
}

Note that all library calls are actual EVM function calls. This means that if you pass memory or value types, a copy will be performed, even of the self variable. The only situation where no copy will be performed is when storage reference variables are used.