A non-backtracking top-down parser


Top-Down Predictive Parsing. We will look at two different ways to implement a non-backtracking top-down parser called a predictive parser. A predictive parser is characterized by its ability to choose the production to apply solely on the basis of the next input symbol and the current nonterminal being processed.

Predictive Parsers. Like recursive descent, but the parser can predict which production to use by looking at the next few tokens, with no backtracking. Predictive parsers accept LL(k) grammars: the first L means a left-to-right scan of the input, the second L means a leftmost derivation, and k means the prediction is based on k tokens of lookahead. In practice, LL(1) is used.

Contd. To enable this, the grammar must take a particular form. We call such a grammar LL(1). The first "L" means we scan the input from left to right; the second "L" means we create a leftmost derivation; and the 1 means one input symbol of lookahead. Informally, an LL(1) grammar has no left-recursive productions and has been left-factored.

Contd. Note also that there exist many grammars that cannot be modified to become LL(1). In such cases, another parsing technique must be employed, or special rules must be embedded into the predictive parser.

Recursive Descent. The first technique for implementing a predictive parser is called recursive descent. A recursive-descent parser consists of several small functions, one for each nonterminal in the grammar. As we parse a sentence, we call the functions that correspond to the left-side nonterminal of the productions we are applying. If those productions are recursive, we end up calling the functions recursively.
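The one-function-per-nonterminal idea can be sketched in Python for the left-factored expression grammar used later in these slides; the tokenizer, class shape, and function names here are illustrative assumptions, not part of the original slides:

```python
# A minimal recursive-descent recognizer for the LL(1) expression grammar
# E → TE',  E' → +TE' | ε,  T → FT',  T' → *FT' | ε,  F → (E) | int
# One function per nonterminal; tokens come from a trivial tokenizer.
import re

def tokenize(text):
    return re.findall(r"\d+|[+*()]", text) + ["$"]

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, expected=None):
        tok = self.tokens[self.pos]
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected}, got {tok}")
        self.pos += 1

    def E(self):                 # E → T E'
        self.T()
        self.E_prime()

    def E_prime(self):           # E' → + T E' | ε
        if self.peek() == "+":
            self.eat("+")
            self.T()
            self.E_prime()       # otherwise take the ε production

    def T(self):                 # T → F T'
        self.F()
        self.T_prime()

    def T_prime(self):           # T' → * F T' | ε
        if self.peek() == "*":
            self.eat("*")
            self.F()
            self.T_prime()

    def F(self):                 # F → ( E ) | int
        if self.peek() == "(":
            self.eat("(")
            self.E()
            self.eat(")")
        elif self.peek().isdigit():
            self.eat()
        else:
            raise SyntaxError(f"unexpected token {self.peek()}")

def accepts(text):
    p = Parser(tokenize(text))
    p.E()
    return p.peek() == "$"       # valid only if all input was consumed

print(accepts("3 + 4 * 5"))      # True
print(accepts("(1 + 2) * 3"))    # True
```

Note how each function decides what to do by inspecting only the single lookahead token, which is exactly the LL(1) property.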

Calculating First Sets. To calculate First(u), where u has the form X1X2...Xn:

a) If X1 is a terminal, add X1 to First(u).
b) Else X1 is a nonterminal, so add First(X1) - ε to First(u).
   a. If X1 is a nullable nonterminal, i.e., X1 =>* ε, add First(X2) - ε to First(u). Furthermore, if X2 can also go to ε, then add First(X3) - ε, and so on, through all Xn until the first non-nullable symbol is encountered.
   b. If X1X2...Xn =>* ε, add ε to the first set.
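These rules can be turned into a small fixed-point computation. The sketch below is an assumed Python rendering (the grammar encoding, the ε marker, and the use of the example grammar from the slides that follow are our choices, not the slides'):

```python
# Fixed-point computation of FIRST sets. Nonterminals are the keys of
# `grammar`; anything else is a terminal; the empty list is the ε production.
EPS = "ε"

grammar = {
    "S":  [["A", "B"]],
    "A":  [["C", "a"], []],
    "B":  [["c", "B'"]],
    "B'": [["a", "A", "C", "B'"], []],
    "C":  [["b"], []],
}

def compute_first(grammar):
    first = {nt: set() for nt in grammar}

    def first_of(symbols):
        """FIRST of a string X1 X2 ... Xn, per rules a) and b) above."""
        result = set()
        for sym in symbols:
            if sym not in grammar:          # terminal: it begins the string
                result.add(sym)
                return result
            result |= first[sym] - {EPS}    # nonterminal: add First(sym) - ε
            if EPS not in first[sym]:       # stop at first non-nullable symbol
                return result
        result.add(EPS)                     # every symbol can derive ε
        return result

    changed = True
    while changed:                          # iterate until nothing new is added
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                new = first_of(prod)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

first = compute_first(grammar)
```

Running this reproduces the worked example below, e.g. First(S) = {b, a, c} and First(A) = {b, a, ε}.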

Calculating Follow Sets. For each nonterminal in the grammar:

1. Place EOF in Follow(S), where S is the start symbol and EOF is the input's right end marker. The end marker might be end of file, newline, or a special symbol, whatever is the expected end-of-input indication for this grammar. We will typically use $ as the end marker.
2. For every production A → uBv, where u and v are any strings of grammar symbols and B is a nonterminal, everything in First(v) except ε is placed in Follow(B).
3. For every production A → uB, or a production A → uBv where First(v) contains ε (i.e., v is nullable), everything in Follow(A) is added to Follow(B).
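A hedged Python sketch of these three rules, applied to the example grammar of the next slide; the FIRST sets are hard-coded from the worked example rather than recomputed, to keep the sketch short:

```python
# FOLLOW-set computation: rule 1 seeds Follow(start) with $; rules 2 and 3
# are applied to every nonterminal occurrence until a fixed point is reached.
EPS, END = "ε", "$"

grammar = {
    "S":  [["A", "B"]],
    "A":  [["C", "a"], []],
    "B":  [["c", "B'"]],
    "B'": [["a", "A", "C", "B'"], []],
    "C":  [["b"], []],
}
FIRST = {"S": {"b", "a", "c"}, "A": {"b", "a", EPS},
         "B": {"c"}, "B'": {"a", EPS}, "C": {"b", EPS}}

def first_of(symbols):
    """FIRST of a symbol string, using the precomputed FIRST table."""
    result = set()
    for sym in symbols:
        if sym not in grammar:
            return result | {sym}
        result |= FIRST[sym] - {EPS}
        if EPS not in FIRST[sym]:
            return result
    return result | {EPS}                   # the whole string can derive ε

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                  # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for a, productions in grammar.items():
            for prod in productions:
                for i, b in enumerate(prod):
                    if b not in grammar:    # only nonterminals get follow sets
                        continue
                    rest = first_of(prod[i + 1:])
                    new = rest - {EPS}      # rule 2: First(v) minus ε
                    if EPS in rest:         # rule 3: v is nullable or absent
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow

follow = compute_follow(grammar, "S")
```

This reproduces the follow sets derived by hand below, e.g. Follow(A) = {c, b, a, $} and Follow(C) = {a, $}.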

Here is a complete example of first and follow set computation, starting with this grammar:

S → AB
A → Ca | ε
B → BaAC | c
C → b | ε

Notice we have a left-recursive production that must be fixed if we are to use LL(1) parsing:

B → BaAC | c   becomes   B → cB'   B' → aACB' | ε

The new grammar is:

S → AB
A → Ca | ε
B → cB'
B' → aACB' | ε
C → b | ε

It helps to first compute the nullable set (i.e., those nonterminals X such that X =>* ε), since you need to refer to the nullable status of various nonterminals when computing the first and follow sets:

Nullable(G) = { A, B', C }
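The nullable set itself falls out of a simple fixed-point loop. This Python sketch (with an assumed grammar encoding, where the empty list stands for the ε production) reproduces Nullable(G) = {A, B', C}:

```python
# Fixed-point computation of the nullable set for the example grammar above.
grammar = {
    "S":  [["A", "B"]],
    "A":  [["C", "a"], []],          # [] is the ε production
    "B":  [["c", "B'"]],
    "B'": [["a", "A", "C", "B'"], []],
    "C":  [["b"], []],
}

def nullable_set(grammar):
    nullable = set()
    changed = True
    while changed:                   # iterate until no new nonterminal is added
        changed = False
        for nt, productions in grammar.items():
            if nt in nullable:
                continue
            # nt is nullable if some production has every symbol nullable
            # (a terminal is never in `nullable`, so it blocks the production)
            if any(all(sym in nullable for sym in prod) for prod in productions):
                nullable.add(nt)
                changed = True
    return nullable

nullable = nullable_set(grammar)     # == {"A", "B'", "C"}
```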

The first sets for each nonterminal are:

First(C) = { b, ε }
First(B') = { a, ε }
First(B) = { c }
First(A) = { b, a, ε }   Start with First(C) - ε, then add a (since C is nullable) and ε (since A itself is nullable).
First(S) = { b, a, c }   Start with First(A) - ε, then add First(B) (since A is nullable). We don't add ε: S itself is not nullable, since A can go away but B cannot.

To compute the follow sets, take each nonterminal and go through all the productions whose right-hand side contains that nonterminal, matching them to the steps given earlier:

Follow(S) = { $ }   S doesn't appear on the right-hand side of any production. We put $ in its follow set because S is the start symbol.
Follow(B) = { $ }   B appears on the right-hand side of the S → AB production. Its follow set is the same as S's.
Follow(B') = { $ }   B' appears on the right-hand side of two productions. The B' → aACB' production tells us its follow set includes the follow set of B', which is tautological. From B → cB', we learn its follow set is the same as B's.

Contd.

Follow(C) = { a, $ }   C appears on the right-hand side of two productions. The production A → Ca tells us a is in the follow set. From B' → aACB', we add First(B'), which is just a again. Because B' is nullable, we must also add Follow(B'), which is $.
Follow(A) = { c, b, a, $ }   A appears on the right-hand side of two productions. From S → AB we add First(B), which is just c (B is not nullable). From B' → aACB', we add First(C), which is b. Since C is nullable, we also include First(B'), which is a. B' is also nullable, so we include Follow(B'), which adds $.

Example. Recall the grammar:

E → T X
X → + E | ε
T → ( E ) | int Y
Y → * T | ε

First sets:
First( ( ) = { ( }
First( T ) = { int, ( }
First( ) ) = { ) }
First( E ) = { int, ( }
First( int ) = { int }
First( X ) = { +, ε }
First( + ) = { + }
First( Y ) = { *, ε }
First( * ) = { * }

Example. Recall the grammar:

E → T X
X → + E | ε
T → ( E ) | int Y
Y → * T | ε

Follow sets:
Follow( + ) = { int, ( }
Follow( * ) = { int, ( }
Follow( ( ) = { int, ( }
Follow( E ) = { ), $ }
Follow( X ) = { $, ) }
Follow( T ) = { +, ), $ }
Follow( ) ) = { +, ), $ }
Follow( Y ) = { +, ), $ }
Follow( int ) = { *, +, ), $ }

Table-Driven LL(1) Parsing. In a recursive-descent parser, the production information is embedded in the individual parse functions for each nonterminal, and the run-time execution stack keeps track of our progress through the parse. There is another method for implementing a predictive parser that uses a table to store the production information, along with an explicit stack to keep track of where we are in the parse.

Contd. This grammar for add/multiply expressions is already set up to handle precedence and associativity:

E → E + T | T
T → T * F | F
F → ( E ) | int

After removal of left recursion, we get:

E → TE'
E' → + TE' | ε
T → FT'
T' → * FT' | ε
F → ( E ) | int

E → TE'
E' → + TE' | ε
T → FT'
T' → * FT' | ε
F → ( E ) | int

First(E) = First(T) = First(F) = { (, int }
First(T') = { *, ε }
First(E') = { +, ε }
Follow(E) = Follow(E') = { $, ) }
Follow(T) = Follow(T') = { +, ), $ }
Follow(F) = { *, +, ), $ }

Contd. A predictive parser behaves as follows. Let's assume the input string is * 5. Parsing begins in the start state of the symbol E and moves to the next state. This transition is marked with a T, which sends us to the start state for T. This, in turn, sends us to the start state for F. F has only terminals, so we read a token from the input string. It must be either an open parenthesis or an integer for this parse to be valid. We consume the integer token, and thus we have hit a final state in the F transition diagram, so we return to where we came from, which is the T diagram;

we have just finished processing the F nonterminal. We continue with T', and go to that start state. The current lookahead is +, which doesn't match the * required by the first production, but + is in the follow set for T', so we match the second production, which allows T' to disappear entirely. We finish T' and return to T, where we are also in a final state. We return to the E diagram, where we have just finished processing the T. We move on to E', and so on.

A table-driven predictive parser uses a stack to store the productions to which it must return. A parsing table stores the actions the parser should take based on the input token and the value on top of the stack. $ is the end-of-input symbol.

Input / top of parse stack:

        int        +            *            (          )         $
E       E → TE'                              E → TE'
E'                 E' → +TE'                            E' → ε    E' → ε
T       T → FT'                              T → FT'
T'                 T' → ε       T' → *FT'               T' → ε    T' → ε
F       F → int                              F → (E)
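The driver loop that consumes such a table can be sketched as follows; the Python encoding of the table and the function name are assumptions for illustration, but the entries mirror the table above:

```python
# Sketch of the table-driven LL(1) driver: a stack of grammar symbols plus
# the parsing table. Each (nonterminal, lookahead) pair selects a production;
# terminals on the stack must match the input directly.
TABLE = {
    ("E",  "int"): ["T", "E'"],     ("E",  "("): ["T", "E'"],
    ("E'", "+"):   ["+", "T", "E'"],
    ("E'", ")"):   [],              ("E'", "$"): [],
    ("T",  "int"): ["F", "T'"],     ("T",  "("): ["F", "T'"],
    ("T'", "+"):   [],              ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"):   [],              ("T'", "$"): [],
    ("F",  "int"): ["int"],         ("F",  "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    tokens = tokens + ["$"]
    stack = ["$", "E"]                       # start symbol on top
    i = 0
    while stack:
        top = stack.pop()
        if top in NONTERMINALS:
            prod = TABLE.get((top, tokens[i]))
            if prod is None:                 # empty cell: syntax error
                raise SyntaxError(f"no rule for ({top}, {tokens[i]})")
            stack.extend(reversed(prod))     # push RHS, leftmost symbol on top
        elif top == tokens[i]:               # terminal (or $): must match input
            i += 1
        else:
            raise SyntaxError(f"expected {top}, got {tokens[i]}")
    return True

print(ll1_parse(["int", "+", "int", "*", "int"]))  # True
```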

LL(1) Grammars: Properties. These predictive top-down techniques (either recursive-descent or table-driven) require a grammar that is LL(1). One fully general way to determine whether a grammar is LL(1) is to build the table and see if you have conflicts. In some cases, you will be able to determine that a grammar is or isn't LL(1) via a shortcut (such as identifying obvious left factors).

To give a formal statement of what is required for a grammar to be LL(1):

No ambiguity
No left recursion

A grammar G is LL(1) iff whenever A → u | v are two distinct productions of G, the following conditions hold:
1. For no terminal a do both u and v derive strings beginning with a (i.e., their first sets are disjoint).
2. At most one of u and v can derive the empty string.
3. If v =>* ε, then u does not derive any string beginning with a terminal in Follow(A) (i.e., first and follow sets must be disjoint if nullable).
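The "build the table and look for conflicts" check mentioned above can be sketched like this; the FIRST and FOLLOW sets are hard-coded from the expression-grammar slide earlier, and the grammar encoding is our assumption:

```python
# A grammar is LL(1) iff each (nonterminal, terminal) table cell receives at
# most one production: a production is entered under First(rhs), plus
# Follow(nonterminal) when the right-hand side is nullable.
EPS = "ε"
grammar = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["int"]],
}
FIRST = {"E": {"(", "int"}, "E'": {"+", EPS}, "T": {"(", "int"},
         "T'": {"*", EPS}, "F": {"(", "int"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"*", "+", ")", "$"}}

def first_of(symbols):
    """FIRST of a symbol string, using the precomputed FIRST table."""
    result = set()
    for sym in symbols:
        if sym not in grammar:
            return result | {sym}
        result |= FIRST[sym] - {EPS}
        if EPS not in FIRST[sym]:
            return result
    return result | {EPS}

def is_ll1(grammar):
    table = {}
    for nt, productions in grammar.items():
        for prod in productions:
            select = first_of(prod)
            if EPS in select:                  # nullable RHS: add Follow(nt)
                select = (select - {EPS}) | FOLLOW[nt]
            for terminal in select:
                if (nt, terminal) in table:    # two productions in one cell
                    return False
                table[(nt, terminal)] = prod
    return True

print(is_ll1(grammar))  # True
```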

Error Detection and Recovery. Panic-mode error recovery is a simple technique that just bails out of the current construct, looking for a safe symbol at which to restart parsing. The parser discards input tokens until it finds what is called a synchronizing token. The synchronizing tokens are those that we believe mark the end of the invalid statement and allow us to pick up at the next piece of code.
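Panic-mode recovery amounts to a few lines of token skipping; the synchronizing-token set below (';', '}', plus the end marker to guarantee termination) is a hypothetical choice for illustration:

```python
# Panic-mode sketch: on a syntax error at position `pos`, discard tokens until
# a synchronizing token is reached, then resume parsing from there.
SYNC_TOKENS = {";", "}", "$"}    # assumed set; "$" ensures the loop terminates

def panic_recover(tokens, pos):
    """Skip input from `pos` until a synchronizing token; return new position."""
    while tokens[pos] not in SYNC_TOKENS:
        pos += 1
    return pos

tokens = ["int", "@", "@", ";", "int", "$"]   # '@' stands in for garbage input
print(panic_recover(tokens, 1))               # 3: parsing resumes at the ';'
```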

Thank You !!!!!!
