First Impressions of C#

2023-05-26 21:00:34

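Getting started

When you first meet a new language, it may differ quite a bit from the C/C++ you already know, so it helps to start with the overall shape of the grammar: for example, how a source file as a whole is structured.

At first glance, the official site describes the structure of a C# source file (a compilation unit) mostly in prose, and the formal grammar that follows does not read as a top-down description. That makes it hard for a newcomer to see the overall structure of the language.

Fortunately there is the open-source DotGNU project. According to its official documentation, the project was formally decommissioned in 2012 (and updates probably stopped earlier). Judging from its grammar file, it has no support for lambda expressions, which are an important feature. I am not sure whether the project simply never added them, or whether C# did not have the feature yet when the project started. Lambda expressions only arrived in C# 3.0, so the latter seems plausible.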

As of December 2012, the DotGNU project has been decommissioned, until and unless a substantial new volunteer effort arises. The exception is the libjit component, which is now a separate libjit package.

dotgnu

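Old as the project is, its grammar is written as a classic yacc grammar, which is the most direct way to see the overall structure. The top-level part is roughly as follows. According to it, the top level of a source file may only contain using, namespace, class, enum, struct, module, interface, and delegate declarations, plus standalone attributes.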

///@file: DotGnu\pnet\cscc\csharp\cs_grammar.y
/*
 * Outer level of the C# input file.
 */

CompilationUnit
	: /* empty */	{
				/* The input file is empty */
				CCTypedWarning("-empty-input",
							   "file contains no declarations");
				ResetState();
			}
	| OuterDeclarationsRecoverable		{
				/* Check for empty input and finalize the parse */
				if(!HaveDecls)
				{
					CCTypedWarning("-empty-input",
								   "file contains no declarations");
				}
				ResetState();
			}
	| OuterDeclarationsRecoverable NonOptAttributes	{
				/* A file that contains declarations and assembly attributes */
				if($2)
				{
					InitGlobalNamespace();
					CCPluginAddStandaloneAttrs
						(ILNode_StandaloneAttr_create
							((ILNode*)CurrNamespaceNode, $2));
				}
				ResetState();
			}
	| NonOptAttributes	{
				/* A file that contains only assembly attributes */
				if($1)
				{
					InitGlobalNamespace();
					CCPluginAddStandaloneAttrs
						(ILNode_StandaloneAttr_create
							((ILNode*)CurrNamespaceNode, $1));
				}
				ResetState();
			}
	;

/*
 * Note: strictly speaking, declarations should be ordered so
 * that using declarations always come before namespace members.
 * We have relaxed this to make error recovery easier.
 */
OuterDeclarations
	: OuterDeclaration
	| OuterDeclarations OuterDeclaration
	;

OuterDeclaration
	: UsingDirective
	| NamespaceMemberDeclaration
	| error			{
				/*
				 * This production recovers from errors at the outer level
				 * by skipping invalid tokens until a namespace, using,
				 * type declaration, or attribute, is encountered.
				 */
			#ifdef YYEOF
				while(yychar != YYEOF)
			#else
				while(yychar >= 0)
			#endif
				{
					if(yychar == NAMESPACE || yychar == USING ||
					   yychar == PUBLIC || yychar == INTERNAL ||
					   yychar == UNSAFE || yychar == SEALED ||
					   yychar == ABSTRACT || yychar == CLASS ||
					   yychar == STRUCT || yychar == DELEGATE ||
					   yychar == ENUM || yychar == INTERFACE ||
					   yychar == '[')
					{
						/* This token starts a new outer-level declaration */
						break;
					}
					else if(yychar == '}' && CurrNamespace.len != 0)
					{
						/* Probably the end of the enclosing namespace */
						break;
					}
					else if(yychar == ';')
					{
						/* Probably the end of an outer-level declaration,
						   so restart the parser on the next token */
						yychar = YYLEX;
						break;
					}
					yychar = YYLEX;
				}
			#ifdef YYEOF
				if(yychar != YYEOF)
			#else
				if(yychar >= 0)
			#endif
				{
					yyerrok;
				}
				NestingLevel = 0;
			}
	;
///....
OptNamespaceMemberDeclarations
	: /* empty */
	| OuterDeclarations
	;

NamespaceMemberDeclaration
	: NamespaceDeclaration
	| TypeDeclaration			{ CCPluginAddTopLevel($1); }
	;

TypeDeclaration
	: ClassDeclaration			{ $$ = $1; }
	| ModuleDeclaration			{ $$ = $1; }
	| StructDeclaration			{ $$ = $1; }
	| InterfaceDeclaration		{ $$ = $1; }
	| EnumDeclaration			{ $$ = $1; }
	| DelegateDeclaration		{ $$ = $1; }
	;

roslyn

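Microsoft has open-sourced its own C# implementation, so the most authoritative answer should come from Microsoft's code. Unfortunately that project is itself written in C#, so its grammar is not described in a yacc file; the parser is hand-written. My guess is that the relevant code lives in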

///@file: roslyn\src\Compilers\CSharp\Portable\Parser

        internal CompilationUnitSyntax ParseCompilationUnitCore()
        {
            SyntaxToken? tmp = null;
            SyntaxListBuilder? initialBadNodes = null;
            var body = new NamespaceBodyBuilder(_pool);
            try
            {
                this.ParseNamespaceBody(ref tmp, ref body, ref initialBadNodes, SyntaxKind.CompilationUnit);

                var eof = this.EatToken(SyntaxKind.EndOfFileToken);
                var result = _syntaxFactory.CompilationUnit(body.Externs, body.Usings, body.Attributes, body.Members, eof);

                if (initialBadNodes != null)
                {
                    // attach initial bad nodes as leading trivia on first token
                    result = AddLeadingSkippedSyntax(result, initialBadNodes.ToListNode());
                    _pool.Free(initialBadNodes);
                }

                return result;
            }
            finally
            {
                body.Free(_pool);
            }
        }

        private void ParseNamespaceBody(
            [NotNullIfNotNull(nameof(openBraceOrSemicolon))] ref SyntaxToken? openBraceOrSemicolon,
            ref NamespaceBodyBuilder body,
            ref SyntaxListBuilder? initialBadNodes,
            SyntaxKind parentKind)
        {
            // "top-level" expressions and statements should never occur inside an asynchronous context
            Debug.Assert(!IsInAsync);

            bool isGlobal = openBraceOrSemicolon == null;

            var saveTerm = _termState;
            _termState |= TerminatorState.IsNamespaceMemberStartOrStop;
            NamespaceParts seen = NamespaceParts.None;
            var pendingIncompleteMembers = _pool.Allocate<MemberDeclarationSyntax>();
            bool reportUnexpectedToken = true;

            try
            {
                while (true)
                {
                    switch (this.CurrentToken.Kind)
                    {
                        case SyntaxKind.NamespaceKeyword:
                            // incomplete members must be processed before we add any nodes to the body:
                            AddIncompleteMembers(ref pendingIncompleteMembers, ref body);

                            var attributeLists = _pool.Allocate<AttributeListSyntax>();
                            var modifiers = _pool.Allocate();

                            body.Members.Add(adjustStateAndReportStatementOutOfOrder(ref seen, this.ParseNamespaceDeclaration(attributeLists, modifiers)));

                            _pool.Free(attributeLists);
                            _pool.Free(modifiers);

                            reportUnexpectedToken = true;
                            break;

                        case SyntaxKind.CloseBraceToken:
                            // A very common user error is to type an additional } 
                            // somewhere in the file.  This will cause us to stop parsing
                            // the root (global) namespace too early and will make the 
                            // rest of the file unparseable and unusable by intellisense.
                            // We detect that case here and we skip the close curly and
                            // continue parsing as if we did not see the }
                            if (isGlobal)
                            {
                                // incomplete members must be processed before we add any nodes to the body:
                                ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);

                                var token = this.EatToken();
                                token = this.AddError(token,
                                    IsScript ? ErrorCode.ERR_GlobalDefinitionOrStatementExpected : ErrorCode.ERR_EOFExpected);

                                this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, token);
                                reportUnexpectedToken = true;
                                break;
                            }
                            else
                            {
                                // This token marks the end of a namespace body
                                return;
                            }

                        case SyntaxKind.EndOfFileToken:
                            // This token marks the end of a namespace body
                            return;

                        case SyntaxKind.ExternKeyword:
                            if (isGlobal && !ScanExternAliasDirective())
                            {
                                // extern member or a local function
                                goto default;
                            }
                            else
                            {
                                // incomplete members must be processed before we add any nodes to the body:
                                ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);

                                var @extern = ParseExternAliasDirective();
                                if (seen > NamespaceParts.ExternAliases)
                                {
                                    @extern = this.AddErrorToFirstToken(@extern, ErrorCode.ERR_ExternAfterElements);
                                    this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, @extern);
                                }
                                else
                                {
                                    body.Externs.Add(@extern);
                                    seen = NamespaceParts.ExternAliases;
                                }

                                reportUnexpectedToken = true;
                                break;
                            }

                        case SyntaxKind.UsingKeyword:
                            if (isGlobal && (this.PeekToken(1).Kind == SyntaxKind.OpenParenToken || (!IsScript && IsPossibleTopLevelUsingLocalDeclarationStatement())))
                            {
                                // Top-level using statement or using local declaration
                                goto default;
                            }
                            else
                            {
                                parseUsingDirective(ref openBraceOrSemicolon, ref body, ref initialBadNodes, ref seen, ref pendingIncompleteMembers);
                            }

                            reportUnexpectedToken = true;
                            break;

                        case SyntaxKind.IdentifierToken:
                            if (this.CurrentToken.ContextualKind != SyntaxKind.GlobalKeyword || this.PeekToken(1).Kind != SyntaxKind.UsingKeyword)
                            {
                                goto default;
                            }
                            else
                            {
                                parseUsingDirective(ref openBraceOrSemicolon, ref body, ref initialBadNodes, ref seen, ref pendingIncompleteMembers);
                            }

                            reportUnexpectedToken = true;
                            break;

                        case SyntaxKind.OpenBracketToken:
                            if (this.IsPossibleGlobalAttributeDeclaration())
                            {
                                // incomplete members must be processed before we add any nodes to the body:
                                ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);

                                var attribute = this.ParseAttributeDeclaration();
                                if (!isGlobal || seen > NamespaceParts.GlobalAttributes)
                                {
                                    RoslynDebug.Assert(attribute.Target != null, "Must have a target as IsPossibleGlobalAttributeDeclaration checks for that");
                                    attribute = this.AddError(attribute, attribute.Target.Identifier, ErrorCode.ERR_GlobalAttributesNotFirst);
                                    this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, attribute);
                                }
                                else
                                {
                                    body.Attributes.Add(attribute);
                                    seen = NamespaceParts.GlobalAttributes;
                                }

                                reportUnexpectedToken = true;
                                break;
                            }

                            goto default;

                        default:
                            var memberOrStatement = isGlobal ? this.ParseMemberDeclarationOrStatement(parentKind) : this.ParseMemberDeclaration(parentKind);
                            if (memberOrStatement == null)
                            {
                                // incomplete members must be processed before we add any nodes to the body:
                                ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBraceOrSemicolon, ref body, ref initialBadNodes);

                                // eat one token and try to parse declaration or statement again:
                                var skippedToken = EatToken();
                                if (reportUnexpectedToken && !skippedToken.ContainsDiagnostics)
                                {
                                    skippedToken = this.AddError(skippedToken,
                                        IsScript ? ErrorCode.ERR_GlobalDefinitionOrStatementExpected : ErrorCode.ERR_EOFExpected);

                                    // do not report the error multiple times for subsequent tokens:
                                    reportUnexpectedToken = false;
                                }

                                this.AddSkippedNamespaceText(ref openBraceOrSemicolon, ref body, ref initialBadNodes, skippedToken);
                            }
                            else if (memberOrStatement.Kind == SyntaxKind.IncompleteMember && seen < NamespaceParts.MembersAndStatements)
                            {
                                pendingIncompleteMembers.Add(memberOrStatement);
                                reportUnexpectedToken = true;
                            }
                            else
                            {
                                // incomplete members must be processed before we add any nodes to the body:
                                AddIncompleteMembers(ref pendingIncompleteMembers, ref body);

                                body.Members.Add(adjustStateAndReportStatementOutOfOrder(ref seen, memberOrStatement));
                                reportUnexpectedToken = true;
                            }
                            break;
                    }
                }
            }
            finally
            {
                _termState = saveTerm;

                // adds pending incomplete nodes:
                AddIncompleteMembers(ref pendingIncompleteMembers, ref body);
                _pool.Free(pendingIncompleteMembers);
            }

            MemberDeclarationSyntax adjustStateAndReportStatementOutOfOrder(ref NamespaceParts seen, MemberDeclarationSyntax memberOrStatement)
            {
                switch (memberOrStatement.Kind)
                {
                    case SyntaxKind.GlobalStatement:
                        if (seen < NamespaceParts.MembersAndStatements)
                        {
                            seen = NamespaceParts.MembersAndStatements;
                        }
                        else if (seen == NamespaceParts.TypesAndNamespaces)
                        {
                            seen = NamespaceParts.TopLevelStatementsAfterTypesAndNamespaces;

                            if (!IsScript)
                            {
                                memberOrStatement = this.AddError(memberOrStatement, ErrorCode.ERR_TopLevelStatementAfterNamespaceOrType);
                            }
                        }

                        break;

                    case SyntaxKind.NamespaceDeclaration:
                    case SyntaxKind.FileScopedNamespaceDeclaration:
                    case SyntaxKind.EnumDeclaration:
                    case SyntaxKind.StructDeclaration:
                    case SyntaxKind.ClassDeclaration:
                    case SyntaxKind.InterfaceDeclaration:
                    case SyntaxKind.DelegateDeclaration:
                    case SyntaxKind.RecordDeclaration:
                    case SyntaxKind.RecordStructDeclaration:
                        if (seen < NamespaceParts.TypesAndNamespaces)
                        {
                            seen = NamespaceParts.TypesAndNamespaces;
                        }
                        break;

                    default:
                        if (seen < NamespaceParts.MembersAndStatements)
                        {
                            seen = NamespaceParts.MembersAndStatements;
                        }
                        break;
                }

                return memberOrStatement;
            }

            void parseUsingDirective(
                ref SyntaxToken? openBrace,
                ref NamespaceBodyBuilder body,
                ref SyntaxListBuilder? initialBadNodes,
                ref NamespaceParts seen,
                ref SyntaxListBuilder<MemberDeclarationSyntax> pendingIncompleteMembers)
            {
                // incomplete members must be processed before we add any nodes to the body:
                ReduceIncompleteMembers(ref pendingIncompleteMembers, ref openBrace, ref body, ref initialBadNodes);

                var @using = this.ParseUsingDirective();
                if (seen > NamespaceParts.Usings)
                {
                    @using = this.AddError(@using, ErrorCode.ERR_UsingAfterElements);
                    this.AddSkippedNamespaceText(ref openBrace, ref body, ref initialBadNodes, @using);
                }
                else
                {
                    body.Usings.Add(@using);
                    seen = NamespaceParts.Usings;
                }
            }
        }

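An oversight

Because this hand-written parser code is rather opaque, I went back to the official C# language specification. It does contain an entry rule for the compilation unit; it is just tucked away deep enough that I missed it the first time (embarrassing). The top-level rule is compilation_unit, and following it downward shows each layer being described and refined in turn. According to this grammar, a compilation unit starts with optional extern alias directives, using directives, and global attributes, and after that may contain only namespace declarations and type_declarations.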

// Source: §14.2 Compilation units
compilation_unit
    : extern_alias_directive* using_directive* global_attributes?
      namespace_member_declaration*
    ;
    
// Source: §22.3 Attribute specification
global_attributes
    : global_attribute_section+
    ;

// Source: §14.6 Namespace member declarations
namespace_member_declaration
    : namespace_declaration
    | type_declaration
    ;

// Source: §14.7 Type declarations
type_declaration
    : class_declaration
    | struct_declaration
    | interface_declaration
    | enum_declaration
    | delegate_declaration
    ;
// Source: §14.3 Namespace declarations
namespace_declaration
    : 'namespace' qualified_identifier namespace_body ';'?
    ;
    
global_attribute_section
    : '[' global_attribute_target_specifier attribute_list ']'
    | '[' global_attribute_target_specifier attribute_list ',' ']'
    ;
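
To make the ordering in compilation_unit concrete, here is a minimal sketch of a single source file that follows it. The names Demo, Square, Program, and the assembly name Hypothetical.Tests are made up for illustration; they do not come from the specification.

```csharp
// 1. using directives come first (extern alias directives, if any, would precede them).
using System;
using System.Runtime.CompilerServices;

// 2. Global attributes follow the using directives.
//    "Hypothetical.Tests" is a made-up assembly name used only for illustration.
[assembly: InternalsVisibleTo("Hypothetical.Tests")]

// 3. Namespace member declarations: namespaces and the five kinds of type declaration.
namespace Demo
{
    public enum Color { Red, Green }

    public struct Point { public int X; public int Y; }

    public interface IShape { int Area(); }

    public delegate int BinaryOp(int a, int b);

    public class Square : IShape
    {
        private readonly int _side;
        public Square(int side) { _side = side; }
        public int Area() => _side * _side;
    }
}

// A type declared outside any namespace is also a namespace_member_declaration.
public class Program
{
    public static void Main()
    {
        Demo.BinaryOp add = (a, b) => a + b;
        Console.WriteLine(new Demo.Square(3).Area()); // prints 9
        Console.WriteLine(add(1, 2));                 // prints 3
    }
}
```

Moving the using directives or the assembly attribute below the namespace should make the compiler complain, which matches the order-related error codes visible in the Roslyn code above.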
    

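Lambda expressions

Among the many expression forms, the lambda is a handy one that shows up in a lot of projects, so it is worth looking at its grammar. The key element is the "=>" token. The signature in front of it may spell out the parameter types in parentheses (explicit_anonymous_function_signature) or leave the types out (implicit_anonymous_function_signature), in which case a single parameter may also drop the parentheses. This syntax looks hard to describe with a yacc grammar, because it depends heavily on context.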

// Source: §12.19.1 General
lambda_expression
    : 'async'? anonymous_function_signature '=>' anonymous_function_body
    ;
anonymous_function_signature
    : explicit_anonymous_function_signature
    | implicit_anonymous_function_signature
    ;

explicit_anonymous_function_signature
    : '(' explicit_anonymous_function_parameter_list? ')'
    ;
implicit_anonymous_function_signature
    : '(' implicit_anonymous_function_parameter_list? ')'
    | implicit_anonymous_function_parameter
    ;

implicit_anonymous_function_parameter_list
    : implicit_anonymous_function_parameter
      (',' implicit_anonymous_function_parameter)*
    ;

implicit_anonymous_function_parameter
    : identifier
    ;
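
The signature forms in this grammar can be illustrated with a small sketch. The class name LambdaForms and the field names are my own; only the Func delegate types come from the standard library.

```csharp
using System;

public static class LambdaForms
{
    // explicit_anonymous_function_signature: parameter types are written out.
    public static readonly Func<int, int, int> Add = (int a, int b) => a + b;

    // implicit_anonymous_function_signature, parenthesized list: types are inferred.
    public static readonly Func<int, int, int> Mul = (a, b) => a * b;

    // implicit_anonymous_function_signature, single bare identifier.
    public static readonly Func<int, int> Inc = x => x + 1;

    // Parentheses with an empty parameter list (the '?' in the grammar).
    public static readonly Func<int> Zero = () => 0;

    public static void Main()
    {
        Console.WriteLine(Add(2, 3)); // prints 5
        Console.WriteLine(Mul(2, 3)); // prints 6
        Console.WriteLine(Inc(41));   // prints 42
        Console.WriteLine(Zero());    // prints 0
    }
}
```

In every case the lambda is assigned to a delegate type; on its own it has no type the compiler can use.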
    

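Other uses of =>

Searching the grammar for '=>' shows that it is used in places other than lambda expressions, for example in local_function_body. Given the same token, how are a lambda expression and a local function told apart? The surrounding grammar answers that: in a local function, the '=>' is preceded by a header that must declare a return type (return_type) and a name, whereas the implicit_anonymous_function_parameter of a lambda appears in expression position, and in this grammar an expression is never introduced by a leading type of that kind.

This is another sign that C# is hard to describe with a general-purpose grammar tool like yacc.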

// Source: §13.6.4 Local function declarations
local_function_declaration
    : local_function_header local_function_body
    ;

local_function_header
    : local_function_modifier* return_type identifier type_parameter_list?
        '(' formal_parameter_list? ')' type_parameter_constraints_clause*
    ;
local_function_modifier
    : 'async'
    | 'unsafe'
    ;

local_function_body
    : block
    | '=>' null_conditional_invocation_expression ';'
    | '=>' expression ';'
    ;
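
A short sketch of the distinction, with names (ArrowForms, Compute, Sum, Twice, inc) that are purely illustrative: both forms use '=>', but the local function is a declaration with a return type and a name, while the lambda is an expression that needs a delegate type to live in.

```csharp
using System;

public static class ArrowForms
{
    public static int Compute(int x, int y)
    {
        // Local function with an expression body: return type and name come first,
        // so this is a declaration, not an expression.
        int Sum() => x + y;

        // Local function with a block body instead of '=>'.
        int Twice(int v)
        {
            return v * 2;
        }

        // Lambda: an expression with no name of its own; it becomes usable
        // once converted to a delegate type such as Func<int, int>.
        Func<int, int> inc = v => v + 1;

        return inc(Twice(Sum()));
    }

    public static void Main()
    {
        Console.WriteLine(Compute(3, 7)); // (3 + 7) * 2 + 1, prints 21
    }
}
```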

Inferences

Global variables

One direct inference: there is no counterpart to the "global variable" of C/C++.

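The Main function

Since there are no global variables or functions, there is also no global main entry point as in C/C++. The entry point of the whole application therefore has to live inside some class or struct (no particular one), and the language requires it to be declared static; it does not have to be public. From C# 9 on, top-level statements let a file omit the explicit Main, but the compiler still synthesizes an entry point for them.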
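
As a minimal sketch of such an entry point: the class name App and the helper Greeting are hypothetical, added only so the program has something to do.

```csharp
using System;

class App
{
    // No access modifier, so Main is private here; it is still a valid entry point
    // because the requirement is that it is static.
    static int Main(string[] args)
    {
        Console.WriteLine(Greeting(args.Length));
        return 0; // the return value becomes the process exit code
    }

    // Hypothetical helper, present only for illustration.
    internal static string Greeting(int argCount)
    {
        return "started with " + argCount + " argument(s)";
    }
}
```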

what if no namespace

Grammatically, a namespace is optional. As in C++, a declaration that is not placed inside a namespace goes into the global namespace.

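Example

Code that follows the grammar is not necessarily legal, though. Most of the code below, written straight from the grammar, turns out to be wrong :-( Programming is hard...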

using System;

//A namespace cannot directly contain members such as fields or methods
int leela = 1;

namespace harry
{
	class harry
	{
		public static int fry(int x, int y)
		{
			int localfunc() => x + y;
			//Only assignment, call, increment, decrement, and new object expressions can be used as a statement
			z => z + 1;
			//error CS0149: Method name expected
			int dd = ((int a) => a + 1)(1);
			return localfunc();
		}
		public static int Main()
		{
			return fry(3, 7);
		}
	};
}


namespace tsecer
{
	//A namespace cannot directly contain members such as fields or methods
	void tsecer(){}
}
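
For comparison, here is one way the example could be rewritten so that it compiles. It is a sketch, not the only possible fix: Globals and Holder are names I made up to give the stray field and method a home, and the return value of fry was changed so that every corrected line contributes to it.

```csharp
using System;

namespace harry
{
    // The closest substitute for a "global variable" is a static field in some class.
    static class Globals
    {
        public static int leela = 1;
    }

    class harry
    {
        public static int fry(int x, int y)
        {
            int localfunc() => x + y;

            // A lambda cannot stand alone as a statement; assign it to a delegate.
            Func<int, int> inc = z => z + 1;

            // To call a lambda in place, cast it to a delegate type first.
            int dd = ((Func<int, int>)((int a) => a + 1))(1);

            return localfunc() + inc(dd) + Globals.leela;
        }

        public static int Main()
        {
            Console.WriteLine(fry(3, 7)); // 10 + 3 + 1, prints 14
            return 0;
        }
    }
}

namespace tsecer
{
    // Methods must be members of a type, so the function moves into a class.
    static class Holder
    {
        public static void tsecer() { }
    }
}
```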