GLIBC strlen源代码分析


		tony bai
直接操作C标准库提供的字符串操作函数是有一定风险的，稍有不慎就会导致内存问题。这周用业余时间写了一个小型的安全字符串操作库，但是测试之后才发现自己的实现有很大的性能缺陷。
在Solaris上初步做了一个简单的性能比对，以下是得到的性能数据(以strlen的数据为例)：
当传入的字符串长度为10时，执行100w次：
strlen 执行时间是：32762毫秒
my_strlen执行时间是：491836毫秒
当传入的字符串长度为20时，执行100w次：
strlen 执行时间是：35075毫秒
my_strlen执行时间是：770397毫秒
很显然，标准库中strlen的消耗仅是my_strlen的十分之一不到，且其性能消耗随着字符串长度的增加并未有近线性的增加，而my_strlen则是变化明显。想必大家这时也能猜到my_strlen采用了传统的实现的方式，即采用逐个字节判断是否为方式，这也与测试出的现象相符。本着刨根问底的精神，我在网上找到了GNU提供的C标准库中strlen实现的源码，要看看GLIBC中strlen究竟采用何种技巧才达到了那么高的性能。说实话在性能优化这方面自己一直还处于比较初级的位置，这也将是自己将来努力的一个方向。
下载了全部GLIBC的代码包，这个包还真不小。在string子目录下找到strlen.c，这就是大多数UNIX平台、Linux平台以及绝大多数GNU软件使用的strlen的实现源码了。这份代码由Torbjorn Granlund(还实现了memcpy)编写，Jim Blandy和Dan Sahlin提供了帮助和注释。包括注释在内，GLIBC的strlen的代码足足有近130行，大致浏览一下， 没有怎么看懂，可耐下心来细致阅读，还是有些心得的。下面是strlen源码摘要版，后面我将针对这段代码写一些我的理解：
  
  1 /* Return the length of the null-terminated string STR.  Scan for
  2    the null terminator quickly by testing four bytes at a time.  */
  3 size_t strlen (str)  const char *str;
  4 {
  5         const char *char_ptr;
  6         const unsigned long int *longword_ptr;
  7         unsigned long int longword, magic_bits, himagic, lomagic;
  8 
  9         /* Handle the first few characters by reading one character at a time.
 10            Do this until CHAR_PTR is aligned on a longword boundary.  */
 11 
 12         for (char_ptr = str; ((unsigned long int) char_ptr
 13              & (sizeof (longword) - 1)) != 0;
 14              ++char_ptr)
 15                 if (*char_ptr == )
 16                         return char_ptr - str;
 17 
 18         /* All these elucidatory comments refer to 4-byte longwords,
 19            but the theory applies equally well to 8-byte longwords.  */
 20 
 21         longword_ptr = (unsigned long int *) char_ptr;
 22 
 23         himagic = 0x80808080L;
 24         lomagic = 0x01010101L;
 25 
 26         if (sizeof (longword) > 8)
 27                 abort ();
 28 
 29         /* Instead of the traditional loop which tests each character,
 30            we will test a longword at a time.  The tricky part is testing
 31            if *any of the four* bytes in the longword in question are zero.  */
 32 
 33         for (;;)      
 34         {                         
 35                 longword = *longword_ptr++;     
 36 
 37                 if ( ((longword - lomagic) & himagic) != 0)
 38                 {
 39                         /* Which of the bytes was the zero?  If none of them were, it was
 40                            a misfire; continue the search.  */
 41 
 42                         const char *cp = (const char *) (longword_ptr - 1);
 43 
 44                         if (cp[0] == 0)
 45                                 return cp - str;
 46                         if (cp[1] == 0)
 47                                 return cp - str + 1;
 48                         if (cp[2] == 0)
 49                                 return cp - str + 2;
 50                         if (cp[3] == 0)
 51                                 return cp - str + 3;
 52                         if (sizeof (longword) > 4)
 53                         {
 54                                 if (cp[4] == 0)
 55                                         return cp - str + 4;
 56

补充：软件开发 , C语言 ,