The DOT Language

The following is an abstract grammar defining the DOT language. Terminals are shown in bold font and nonterminals in italics. Literal characters are given in single quotes. Parentheses ( and ) indicate grouping when needed. Square brackets [ and ] enclose optional items. Vertical bars | separate alternatives.

graph : [ strict ] (graph | digraph) [ ID ] '{' stmt_list '}'
stmt_list : [ stmt [ ';' ] [ stmt_list ] ]
stmt : node_stmt
  | edge_stmt
  | attr_stmt
  | ID '=' ID
  | subgraph
attr_stmt : (graph | node | edge) attr_list
attr_list : '[' [ a_list ] ']' [ attr_list ]
a_list : ID [ '=' ID ] [ ',' ] [ a_list ]
edge_stmt : (node_id | subgraph) edgeRHS [ attr_list ]
edgeRHS : edgeop (node_id | subgraph) [ edgeRHS ]
node_stmt : node_id [ attr_list ]
node_id : ID [ port ]
port : ':' ID [ ':' compass_pt ]
  | ':' compass_pt
subgraph : [ subgraph [ ID ] ] '{' stmt_list '}'
compass_pt : (n | ne | e | se | s | sw | w | nw | c | _)

The keywords node, edge, graph, digraph, subgraph, and strict are case-independent. Note also that the allowed compass point values are not keywords, so these strings can be used elsewhere as ordinary identifiers and, conversely, the parser will actually accept any identifier.

An ID is one of the following:

  • any string of alphabetic ([a-zA-Z\200-\377]) characters, underscores ('_') or digits ([0-9]), not beginning with a digit;
  • a numeral [-]?(.[0-9]+ | [0-9]+(.[0-9]*)? );
  • any double-quoted string ("…") possibly containing escaped quotes (\");
  • an HTML string (<…>).

    An ID is just a string; the lack of quote characters in the first two forms is just for simplicity. There is no semantic difference between abc_2 and "abc_2", or between 2.34 and "2.34". Obviously, to use a keyword as an ID, it must be quoted. Note that, in HTML strings, angle brackets must occur in matched pairs, and unescaped newlines are allowed. In addition, the content must be legal XML, so that the special XML escape sequences for ", &, <, and > may be necessary in order to embed these characters in attribute values or raw text.
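
The quoting rules above can be sketched in a few lines of Python. This is only an illustration, not part of Graphviz: the function name quote_id is hypothetical, the bare-identifier pattern is restricted to ASCII (so any non-ASCII text is simply quoted, which is always safe), and HTML strings and text ending in a backslash are not handled.

```python
import re

# Keywords from the grammar above; they are case-independent.
KEYWORDS = {"node", "edge", "graph", "digraph", "subgraph", "strict"}

# First ID form (ASCII subset only): letters, underscores, digits,
# not beginning with a digit.
BARE_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Second ID form: a numeral such as -.5, 2, or 2.34.
NUMERAL = re.compile(r"-?(\.[0-9]+|[0-9]+(\.[0-9]*)?)")


def quote_id(text):
    """Return text as a DOT ID, adding double quotes only when required."""
    is_keyword = text.lower() in KEYWORDS
    if not is_keyword and (BARE_ID.fullmatch(text) or NUMERAL.fullmatch(text)):
        return text
    # Inside a double-quoted string, embedded double quotes are escaped.
    return '"' + text.replace('"', '\\"') + '"'
```

For example, quote_id("abc_2") leaves the string alone, while quote_id("Node") adds quotes because node is a keyword regardless of case.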

    Both quoted strings and HTML strings are scanned as a unit, so any embedded comments will be treated as part of the strings.

    An edgeop is -> in directed graphs and -- in undirected graphs.
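
As a small illustration of the graph rule and the edgeop choice, the following Python sketch renders a list of edges as DOT text. The function name render_graph is hypothetical, and the sketch assumes that the graph name and node names are already valid unquoted IDs.

```python
def render_graph(name, edges, directed=True, strict=False):
    """Render (tail, head) pairs as DOT text, following the grammar above."""
    keyword = "digraph" if directed else "graph"
    # The edge operator must match the graph kind.
    edgeop = "->" if directed else "--"
    header = ("strict " if strict else "") + keyword + " " + name
    lines = [header + " {"]
    for tail, head in edges:
        lines.append("  " + tail + " " + edgeop + " " + head + ";")
    lines.append("}")
    return "\n".join(lines)
```

Calling render_graph("G", [("a", "b")]) yields a digraph that uses ->, while passing directed=False yields a graph that uses --.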

    An a_list clause of the form ID is equivalent to ID=true.

    The language supports C++-style comments: /* */ and //. In addition, a line beginning with a '#' character is considered a line output from a C preprocessor (e.g., # 34 to indicate line 34) and discarded.

    Semicolons aid readability but are not required except in the rare case that a named subgraph with no body immediately precedes an anonymous subgraph, since the precedence rules cause this sequence to be parsed as a subgraph with a heading and a body. Also, any amount of whitespace may be inserted between terminals.

    As another aid for readability, dot allows single logical lines to span multiple physical lines using the standard C convention of a backslash immediately preceding a newline character. In addition, double-quoted strings can be concatenated using a '+' operator. As HTML strings can contain newline characters, they do not support the concatenation operator.

    Subgraphs and Clusters

    Subgraphs play three roles in Graphviz. First, a subgraph can be used to  represent graph structure, indicating that certain nodes and edges should  be grouped together. This is the usual role for subgraphs  and typically specifies semantic information about the graph components.

    In the second role, a subgraph can provide a context for setting attributes. For example, a subgraph could specify that blue is the default color for all nodes defined in it. In the context of graph drawing, a more interesting example is:

        subgraph {
          rank = same; A; B; C;
        }

    This (anonymous) subgraph specifies that the nodes A, B and C  should all be placed on the same rank if drawn using dot.

    The third role for subgraphs directly involves how the graph will be laid out by certain layout engines. If the name of the subgraph begins with cluster, Graphviz notes the subgraph as a special cluster subgraph. If supported, the layout engine will do the layout so that the nodes belonging to the cluster are drawn together, with the entire drawing of the cluster contained within a bounding rectangle. Note that, for better or worse, cluster subgraphs are not part of the DOT language, but solely a syntactic convention adhered to by certain of the layout engines.
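
The second and third roles can be illustrated with a short Python sketch that emits subgraph statements. The names render_subgraph and is_cluster are hypothetical, node names and attribute values are assumed to be valid unquoted IDs, and the prefix test simply mirrors the "begins with cluster" convention described above.

```python
def render_subgraph(nodes, name="", attrs=None):
    """Render a subgraph statement; an empty name gives an anonymous one."""
    header = "subgraph " + name + " " if name else ""
    # Attribute assignments come first, then the member nodes.
    body = [key + " = " + value + ";" for key, value in (attrs or {}).items()]
    body += [node + ";" for node in nodes]
    return header + "{ " + " ".join(body) + " }"


def is_cluster(name):
    """Cluster subgraphs are recognized purely by their name prefix."""
    return name.startswith("cluster")
```

An anonymous subgraph carrying rank = same is produced by render_subgraph(["A", "B", "C"], attrs={"rank": "same"}), whereas a name such as cluster_0 marks the subgraph for layout engines that support clusters.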

    Lexical and Semantic Notes

    If a default attribute is defined using a node,  edge, or  graph statement, or by an attribute assignment not attached to a node or edge, any object of the appropriate type defined afterwards will inherit this attribute value. This holds until the default attribute is set to a new value, from which point the new value is used. Objects defined before a default attribute is set will have an empty string value attached to the attribute once the default attribute definition is made.
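
The ordering rule for default attributes can be modeled with a toy Python sketch. It is a simplified model under stated assumptions, not Graphviz's implementation: statements are represented as hypothetical tuples, either ("default", attrs) for a node default or ("node", name, attrs) for a node definition.

```python
def resolve_node_attrs(statements):
    """Apply node defaults in statement order and return each node's attributes."""
    defaults = {}  # node defaults currently in effect
    nodes = {}     # node name -> resolved attributes
    for stmt in statements:
        if stmt[0] == "default":
            for key, value in stmt[1].items():
                if key not in defaults:
                    # First declaration of this attribute: nodes defined
                    # earlier receive the empty string for it.
                    for attrs in nodes.values():
                        attrs.setdefault(key, "")
                defaults[key] = value
        else:
            _, name, own = stmt
            # A new node inherits the defaults in effect at this point.
            nodes.setdefault(name, dict(defaults)).update(own)
    return nodes
```

With the sequence node a, default color=blue, node b, default color=red, node c, the model gives a an empty color, b the color blue, and c the color red.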

    Note, in particular, that a subgraph receives the attribute settings of its parent graph at the time of its definition. This can be useful; for example, one can assign a font to the root graph and all subgraphs will also use the font. For some attributes, however, this property is undesirable. If one attaches a label to the root graph, it is probably not the desired effect to have the label used by all subgraphs. Rather than listing the graph attribute at the top of the graph, and then resetting the attribute as needed in the subgraphs, one can simply defer the attribute definition in the graph until the appropriate subgraphs have been defined.

    If an edge belongs to a cluster, its endpoints belong to that cluster. Thus, where you put an edge can affect a layout, as clusters are sometimes laid out recursively.

    There are certain restrictions on subgraphs and clusters. First, at present, the names of a graph and its subgraphs share the same namespace. Thus, each subgraph must have a unique name. Second, although nodes can belong to any number of subgraphs, it is assumed clusters form a strict hierarchy when viewed as subsets of nodes and edges.

    Character Encodings

    The DOT language assumes at least the ASCII character set. Quoted strings, both ordinary and HTML-like, may contain non-ASCII characters. In most cases, these strings are uninterpreted: they simply serve as unique identifiers or values passed through untouched. Labels, however, are meant to be displayed, which requires that the software be able to compute the size of the text and determine the appropriate glyphs. For this, it needs to know what character encoding is used.

    By default, DOT assumes the UTF-8 character encoding. It also accepts the Latin1 (ISO-8859-1) character set, assuming the input graph uses the charset attribute to specify this. For graphs using other character sets, there are usually programs, such as iconv, which will translate from one character set to another.

    Another way to avoid non-ASCII characters in labels is to use HTML entities for special characters. During label evaluation, these entities are translated into the underlying character. The supported entities are listed in a table giving their Unicode value, a typical glyph, and the HTML entity name. Thus, to include a lower-case Greek beta in a string, one can use the ASCII sequence &beta;. In general, one should only use entities that are allowed in the output character set, and for which there is a glyph in the font.
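
As a rough illustration of the entity approach, the Python sketch below rewrites non-ASCII characters in a label as named HTML entities. It relies on Python's standard html.entities table, which is an assumption standing in for the set of entities Graphviz supports; the two sets may differ. The function name ascii_label is hypothetical, and characters without a known entity name are left unchanged.

```python
from html.entities import codepoint2name


def ascii_label(text):
    """Replace non-ASCII characters with named HTML entities where one exists."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            out.append("&" + name + ";")
        else:
            # ASCII characters, and characters with no entity name, pass through.
            out.append(ch)
    return "".join(out)
```

For instance, a label containing a lower-case Greek beta comes back with the ASCII sequence &beta; in its place.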

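    Visualizing Function Calls with Graphviz

    GNU Entry and Exit Profiling Functions

    Here we record each function's address at function entry and exit, writing it to a file. But when should that file be opened?
    The gcc developers considered this problem too: gcc provides constructor and destructor function attributes that happen to meet this need.
    A constructor function is called before main is called, and a destructor function is called when the application exits.
    To create them, declare two functions and apply the constructor and destructor function attributes to them.
    The constructor function opens a new trace file, to which the address trace for the profiling data is written; the destructor function closes the trace file.

    The two special functions are declared as follows: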

     
    /* Constructor and Destructor Prototypes */
    void main_constructor( void )
        __attribute__ ((no_instrument_function, constructor));
    void main_destructor( void )
        __attribute__ ((no_instrument_function, destructor));

    In pvtrace, they are defined as follows:

     
    #include <stdio.h>
    #include <stdlib.h>

    /* Output trace file pointer */
    static FILE *fp;

    /* Runs before main(): open the trace file. */
    void main_constructor( void )
    {
      fp = fopen( "trace.txt", "w" );
      if (fp == NULL) exit(-1);
    }

    /* Runs at program exit: close the trace file. */
    void main_destructor( void )
    {
      fclose( fp );
    }

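    Now compile the four functions above (all contained in the instrument.c file of the pvtrace package) together with the source code to be analyzed.
    Running the compiled program then produces a trace.txt file in the current directory, with contents similar to: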
    ===== trace.txt =====
    E0x8048538
    E0x80484f4
    E0x80484a8
    X0x80484a8
    X0x80484f4
    X0x8048538
    =====================

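    Next, use the pvtrace command to analyze trace.txt. The analysis involves two concepts, a stack and symbols, implemented in stack.c and symbols.c; both are fairly simple.
    Addresses are converted to function names with the GNU addr2line command. The result of the analysis is a dot file that uses only a few simple pieces of DOT syntax.

    Finally, use Graphviz's dot command to convert the dot file into an image.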
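
The stack-based analysis can be sketched in Python as follows. This is not pvtrace's actual code, only a minimal illustration of the idea under stated assumptions: the function name trace_to_dot is hypothetical, and the optional names dictionary is a stand-in for the address-to-name lookup that pvtrace performs with addr2line.

```python
from collections import Counter


def trace_to_dot(lines, names=None):
    """Turn E<addr>/X<addr> trace lines into DOT text for a call graph."""
    names = names or {}
    stack = []         # addresses of the functions currently executing
    calls = Counter()  # (caller, callee) -> number of calls
    for line in lines:
        line = line.strip()
        if not line:
            continue
        kind, addr = line[0], line[1:]
        if kind == "E":
            # On entry, the function on top of the stack is the caller.
            if stack:
                calls[(stack[-1], addr)] += 1
            stack.append(addr)
        elif kind == "X" and stack:
            stack.pop()
    out = ["digraph calls {"]
    for (caller, callee), count in calls.items():
        tail = names.get(caller, caller)
        head = names.get(callee, callee)
        out.append('  "%s" -> "%s" [label="%d"];' % (tail, head, count))
    out.append("}")
    return "\n".join(out)
```

Fed the six trace lines shown earlier, the sketch produces two edges: one from 0x8048538 to 0x80484f4 and one from 0x80484f4 to 0x80484a8, each labeled with a call count of 1.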

    Reference: http://www.ibm.com/developerworks/cn/linux/l-graphvis/

    雨霖铃

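    About GraphViz

    While studying function call chains in C, I looked into GraphViz. It can be used from PHP, but from what I saw this is done not through a module but by invoking the GraphViz commands from PHP.

    Related material: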
    http://www.graphviz.org/

    vimgrep

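    Search for a keyword in a given file:
    :vimgrep keyword file

    Search for a keyword under a directory:
    :vimgrep keyword **

    Search for a keyword in the PHP files under a directory:
    :vimgrep keyword **/*.php

    Find every match of the keyword (with the g flag, every match on a line is recorded, not just the first):
    :vimgrep /keyword/g **/*.php

    --------------
    :vimgrep is implemented by Vim itself. It is a built-in, cross-platform search tool; it is somewhat slower, but it has very good regular expression support.
    :grep uses an external tool.

    There is also a Vim plugin dedicated to searching: http://www.vim.org/scripts/script.php?script_id=311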

    A Few memcacheq Commands

    1. Check the queue's backlog
    sh mq_watch.sh block 10.55.38.24 22202 qname

    2. Check writes to the queue
    sh mq_watch.sh write 10.55.38.24 22202 qname

    3. Check consumption from the queue
    sh mq_watch.sh read 10.55.38.24 22202 qname

     

    mq_watch.sh

     

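    Vim's fuf Plugin

    Vim's fuf plugin works very well for finding files under a directory. However, whenever I double-clicked a PHP file to open it in Vim, I got an error saying that l9 had failed to load, and the fuf plugin then stopped working. I reinstalled both the l9 plugin and the fuf plugin, but that did not help. Later I also re-ran Vim's install.exe, and that did not help either.

    By chance, I noticed that starting Vim from the Quick Launch bar produced no such error, and fuf worked. I then looked closely at the error shown when double-clicking a PHP file and found that the Vim being started was my Vim 7.1, and I had previously read that fuf only works with Vim 7.2 and later. At that point the problem was essentially clear: I used to use Vim 7.1, and after upgrading to Vim 7.3 I never removed 7.1. The Quick Launch shortcut certainly starts Vim 7.3, but I did not know which Vim version the .php file association pointed to, and, annoyingly, reinstalling Vim did not update that association. Once I manually changed the .php file association to Vim 7.3, the problem was solved.

    I struggled with this for over an hour yesterday, yet this time I solved it in a few minutes. Sometimes brute force is not the answer.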

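    About MongoDB's db.stats()

    An explanation of the fields in the output of MongoDB's db.stats():

    collections: the number of collections
    objects: the number of objects, computed by looping over every collection and summing each collection's record count (nrecords)
    avgObjSize: the average size of each object, computed as dataSize / objects
    dataSize: the sum of dataSize over all collections
    storageSize: the sum of storageSize over all collections
    fileSize: the size of the physical storage files for the database; the physical files are somewhat larger than the actual data because of MongoDB's preallocation mechanism

    The hardest field to understand is probably storageSize; for how it differs from dataSize, see: http://blog.nosqlfan.com/html/2654.html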
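
The aggregation described by these fields can be sketched in Python. This is only an illustration of the arithmetic, not MongoDB's implementation: the function name summarize_db and the per-collection dictionary keys (nrecords, dataSize, storageSize) are hypothetical stand-ins.

```python
def summarize_db(collections):
    """Combine per-collection figures into db.stats()-style totals."""
    objects = sum(c["nrecords"] for c in collections)
    data_size = sum(c["dataSize"] for c in collections)
    storage_size = sum(c["storageSize"] for c in collections)
    return {
        "collections": len(collections),
        "objects": objects,
        # Guard against an empty database to avoid dividing by zero.
        "avgObjSize": data_size / objects if objects else 0,
        "dataSize": data_size,
        "storageSize": storage_size,
    }
```

Two collections holding 2 and 3 records with 100 and 150 bytes of data give 5 objects, a dataSize of 250, and an avgObjSize of 50.0.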

    The relevant code is in db\dbcommands.cpp.

     

     


     

     

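    About MongoDB

    1. Iterating over results in MongoDB
    db.phpor.test.find();
    it

    In the mongo shell, find() prints the first batch of results, and typing it fetches the next batch.

    2. MongoDB can store JavaScript on the server side, so some business logic can be written in JavaScript, and the algorithm does not have to be sent over the network every time.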

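    About the iframe onload Event

    An iframe can be used to make cross-domain POST requests, with the iframe's onload event used to detect when the request has completed.
    I had assumed that the iframe's onload event was also subject to cross-domain restrictions, but testing showed that it is not; it worked fine in IE, Firefox, and Safari.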