C / C++

October 08, 2009

Cppref: reading cppreference.com docs offline, like man or info or perldoc

Today I created a tiny script called cppref, a wrapper for documents on cppreference.com.  Blurbs are:

  • docs are bundled with the interface, no network access required
  • works like man(1) or info(2) or perldoc(2)

It looks like follows.

$ cppref
You are here: C++ Reference

C++ Reference

                                  C++

General Topics                      * C++ Strings
                                    * C++ I/O
  * FAQ                                 + C++ String Streams
  * Pre-processor commands          * C++ Exceptions
  * Operator Precedence
  * Escape Sequences              C++ Standard Template Library (STL)
  * ASCII Chart
  * Data Types                      * Overview
  * Keywords                        * Iterators
                                    * C++ Algorithms
Standard C Library                  * C++ Vectors
                                    * C++ Double-Ended Queues
  * Overview                        * C++ Lists
  * Standard C I/O                  * C++ Priority Queues
  * Standard C String & Character   * C++ Queues
  * Standard C Math                 * C++ Stacks
  * Standard C Time & Date          * C++ Sets
  * Standard C Memory               * C++ Multisets
  * Other standard C functions      * C++ Maps
                                    * C++ Multimaps
                                    * C++ Bitsets

$ cppref vector
You are here: C++ Reference >> C++ Standard Template Library >> C++ Vectors

C++ Vectors

Vectors contain contiguous elements stored as an array.

Accessing members of a vector can be done in constant time, appending elements
to a vector can be done in amortized constant time, whereas locating a specific
value or inserting elements into the vector takes linear time.

Constructors create vectors and initialize them with some data
Operators    compare, assign, and access elements of a vector
assign       assign elements to a vector
at           returns an element at a specific location
back         returns a reference to last element of a vector
(snip)

Or if the specified term maps to multiple files,

$ cppref push_back        
multiple choices:
  stl::deque::push_back
  stl::list::push_back
  stl::vector::push_back
  string::push_back

By default, cppref uses w3m as its viewer, so you can follow the links to read the documents.

Cppref is available from search.cpan.org/dist/cppref or github.com/kazuho/cppref.  Hove fun!

August 26, 2009

Picoev: a tiny event loop for network applications, faster than libevent or libev

I am sure many programmers writing network applications have their own abstracting layers hiding the differences between various I/O multiplex APIs, like select(2), poll(2), epoll(2), ... And of course, I am one among them.  While writing mycached (see Mycached: memcached protocol support for MySQL for more information), I was at first considering of using libev for multiplexing socket I/Os.  Libevent was not an option since it does not (yet) provide multithreading support.

But it was a great pain for me to learn how to use libev.  I do not mean that its is an ugly product.  In fact, I think that it is a very well written, excellent library.  However, for me it was too much a boring task to learn how the things are abstracted, already being familiar with the problems it tries to hide.

So instead I thought it might be a good occasion to write my own library that could be used in any programs I may write in the future.  The result is picoev, and it is faster than libevent or libev!  The benchmark used is a re-modified version taken from libev.schmorp.de/bench.html and can be found here.

Why is it faster?  It is because it uses an array and a ring buffer of bit vectors as its internal structure.  Libevent and libev seem to use some kind of sorted tree to represent file descriptors.  However, if we concentrate on Un*x systems, there is a guarantee that the descriptors will a small positive integer.  Picoev utilizes the fact and stores information related to file descriptors (such as pointers to callback functions or callback arguments) in an array, resulting in a faster manipulation of socket states.

Another optimization technique used by picoev is not to use an ordered tree for keeping timeout information.  Generally speaking, most network applications do not require accurate timeouts.  Thus it is possible to use a ring buffer (a sliding array) of bit vectors for the purpose.  Each bit vector represents a set of file descriptors that time-outs at a given time.  Picoev uses 128 of bit vectors to represent timeouts, for example, the first bit vector represents the sockets that timeout a second after, the second bit vector representing them of two seconds after..., and the bit vectors slide every second.  If the maximum timeout required by the web application is greater than 128, the minimum granurality of timeout becomes two seconds.

I would like to reiterate that both libevent and libev are great libraries.  Picoev is not at comparable to them especially in maturity and the number of features.  It only supports select(2), epoll(2), and kqueue(2) for the time being.  However the design is simple than the other two, and I think it will be a good starting point to write network applications, or to use as a basis for writing one's own network libraries.

If you are interested in using picoev, the source code can be found at coderepos.org/share/browser/lang/c/picoev/.

July 02, 2009

今更 C++ で JSON パーサ「picojson」を書いたわけ

 既に mattn さんが、「Big Sky :: ヘッダファイルだけでC++から使えるJSONパーサ「picojson」が凄い!」で紹介してくださっています (mattn さん、アドバイス&バグ情報ありがとうございます!) が、いまさら C++ で JSON パーサを作りました。それは、以下の3点を満たすものがなかったから。

  • ヘッダファイル only
  • boost 等、他の重たいライブラリに依存しない
  • array や object が STL にマッピングされる

 コードは、coderepos に置いてありますので、よろしければお使いください (picojson.h)。

 なお、現時点での制限事項として、

  • \n や \r, \uXXXX といったエスケープの処理が未実装rev. 34232 で対応しました (含サロゲートペア)
  • 空白文字の判断基準が RFC と異なるrev. 34277 で空白と文字列定数内で許容する文字を RFC4627 に一致させました

といった点がありますので、解釈のゆれによってセキュリティ上問題が発生しうるケースでのご使用は注意していただければと思います。

 あとは、operator<< と >> を定義して、iostream から読み書きできるようにするくらいかなrev. 32477 の時点で operator<< と >> に対応済です。それでは have fun!

June 23, 2009

スレッド間で共有する変数のアクセス権制御を C++ コンパイラで強制する方法

 マルチスレッドなプログラムを書いていると、スレッド間で共有する変数へのアクセスを正しく直列化できているか、という点が常に問題になります。どうせなら、正しく書けているかコンパイル時に確認したいよね、ということで、以下のような C++ テンプレートを書いてみました。

template <typename T> class cac_mutex_t {
public:
 
  class lockref {
  protected:
    cac_mutex_t<T>* m_;
  public:
    lockref(cac_mutex_t<T>& m) : m_(&m) {
      pthread_mutex_lock(&m_->mutex_);
    }
    ~lockref() {
      pthread_mutex_unlock(&m_->mutex_);
    }
    T& operator*() { return m_->t_; }
    T* operator->() { return &operator*(); }
  private:
    lockref(const lockref&);
    lockref& operator=(const lockref&);
  };

protected:
  friend class cac_mutex_t<T>::lockref;
  T t_;
  pthread_mutex_t mutex_;
public:
  cac_mutex_t(pthread_mutexattr_t* attr) : t_() {
    pthread_mutex_init(&mutex_, attr);
  }
  ~cac_mutex_t() {
    pthread_mutex_destroy(&mutex_);
  }
  const T* unsafe_ref() const { return &t_; }
  T* unsafe_ref() { return &t_; }
private:
  cac_mutex_t(const cac_mutex_t&);
  cac_mutex_t& operator=(const cac_mutex_t&);
};

 使用例は以下のとおり。

// mutex lock が必須な Foo 型の変数を宣言
cac_mutex_t<Foo> foo_serialized(NULL);

{ // RAII で mutex lock してアクセス
  cac_mutex_t<Foo>::lockref foo(foo_serialized);
  foo->x = ...
  ...
}

 要は、RAII かつスマートポインタなクラスを使って、ロックをコントロールしつつ、アクセスの直列化を保証できるよ、ということです。

 テクニックとしては既存だと思いますが、ちょっとググった範囲では見つからなかったので、紹介させていただきました。

6月24日追記: 設計を整理、rwlock を追加して Q4M の svn に追加しました。(link)