Commit e04b2803 authored by Isaac Oscar Gariano

Implemented lots of nice Syntax Sugar, also fixed broken detection of syntax errors.

parent 4f5d6bf2
# muc - A mu compiler
Translates MuIR in text form and creates a corresponding boot image, or generates C code that does the same.
## Prerequisites:
* You will need a C++14 compiler (tested with g++ 5.4.0)
......@@ -10,7 +11,7 @@
cp -r $antlr4/runtime/Cpp/run/usr/local/lib/* lib/
* If you want to compile directly (as opposed to outputting C files) you'll need mu; place your `libmu*` files in `./lib`
also set the environment variable MU_ZEBU is to the location of the mu-impl-fast source code
also set the environment variable MU_ZEBU to the location of the mu-impl-fast source code
## C Code prerequisites
To run the generated C code you will need:
......@@ -25,13 +26,60 @@ Run: `LD_LIBRARY_PATH=./lib muc -r [-f primordial-function-name] bundle1.mu... b
Which will compile the bundles (there must be at least one) and generate a boot image in the file `bootimage`
Use `-c` instead of `-r` to skip building the boot image; muc will instead print C code to stdout that has the same effect once compiled and run.
Mu's output files will be written to the same folder that contains `bootimage`.
## Notes
Use `-s` instead of `-r` to do syntax checking of your input files and nothing else.
## Syntax
The syntax is specified in `UIR.g4`, all the files in `./parser` are generated from this file.
The syntax is a superset of the reference implementation's syntax and of the syntax used in the mu-spec.
Note: you may now specify the function signature before the version in a `.funcdef`; it is also now optional.
It adds the following 'syntax sugar':
* comments, C++ style `// ... ` and C-style `/* ... */`.
* The signature in a `.funcdef` is optional, it may also be specified before the `VERSION` name.
* If there is a signature in a `.funcdef` and no previous corresponding `.funcdecl`, one will be created for you.
* the version name in a `.funcdef` is optional, if absent a local one (starting with a `%`) will be automatically generated
(e.g. you can write `.funcdef @foo<@sig> { ... }`, instead of `.funcdecl @foo<@sig>` and `.funcdef @foo VERSION %__1 {...}`)
* Types can be inlined (e.g. `.global @foo <int<32>>`, instead of `.typedef @int32 = int<32>`, `.global @foo <@int32>`)
* Function signatures can be inlined (e.g. `.funcdecl @foo <(int<32>)->(int<32>)>`).
* Constants can be inlined with the syntax `<type>ctor` (equivalent to `.const @__anon <type> = ctor`).
* A bundle may be given a name with the directive `.bundle @name` (the `@` is optional), it takes effect only for the following lines,
the name can be changed again with another `.bundle` directive. Muc will automatically generate a name if there isn't one.
(note: the name is only used in forming local names (see below), it is not passed to the Mu API).
* A top level declaration may be given a local name, in which case it is prefixed with the name of the bundle followed by a `.`.
(e.g. `.typedef %foo = void`, instead of `.typedef @__bundle_1.foo = void`). (Note: you can give a `.const` a name starting with `%`; however,
when referring to it with `%`, a local name more recently defined with a `%` (such as a block parameter) will have priority.)
* Top level declarations with a global name starting with an '_' (such as names local to unnamed bundles) are not passed to the whitelist of `make_boot_image`.
* You may omit the `@` or `%` in front of names, in which case the name `foo` is interpreted according to the following rules:
* if the name could be interpreted as a keyword/part of the syntax, it is (e.g. `.global v <void>` creates a new anonymous type with value `void`, it does not refer to `@void`)
* if the name is used for a 'value' (an operand of an instruction) and `%foo` was declared, then it is interpreted as `%foo`
* if it is used where the name of a top-level declaration is expected, it is interpreted as `@foo`
* otherwise it is interpreted as `%foo`
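The lookup order for bare names can be sketched as a small function. This is only an illustration of the four rules as literally stated above, not muc's actual implementation: `resolve_bare_name`, `Context`, `keywords` and `declared_locals` are hypothetical names standing in for whatever state the real visitors keep.

```cpp
#include <set>
#include <string>

// Where a bare name (one written without '@' or '%') appears in the input.
enum class Context { Value, TopLevel, Other };

// Hypothetical helper: applies the four rules in order and returns the
// name in its prefixed form (or unchanged if it is a keyword).
std::string resolve_bare_name(const std::string& name, Context where,
                              const std::set<std::string>& keywords,
                              const std::set<std::string>& declared_locals) {
    if (keywords.count(name))
        return name;          // rule 1: keywords and syntax always win
    if (where == Context::Value && declared_locals.count(name))
        return "%" + name;    // rule 2: an operand naming a declared local
    if (where == Context::TopLevel)
        return "@" + name;    // rule 3: a top-level declaration is expected
    return "%" + name;        // rule 4: everything else is treated as local
}
```

Under these assumptions, `resolve_bare_name("foo", Context::TopLevel, {"void"}, {})` yields `@foo`, while the same name in any other position yields `%foo`.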
Notes:
A new declaration will be generated for each unique inline declaration. In addition, they will not be passed to the white list of boot images.
An inline declaration will never have the same ID as a non-inline declaration.
e.g.
.typedef @int32 = int<32>
.funcsig @foo = (int<32>)->(int<32>)
Declares 2 types `@int32` and an anonymous `int<32>`.
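One way to picture "one declaration per unique inline declaration" is a table keyed by the inline constructor's text. The sketch below is an assumption about the general technique, not a copy of muc's code: the class `InlineNames` and its member `name_for` are hypothetical, and the generated names simply follow the `@__#` pattern mentioned under Limitations.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical sketch: hands out one generated name per distinct inline
// constructor text, so repeated uses of e.g. `int<32>` share a declaration.
class InlineNames {
    std::map<std::string, std::string> names_; // constructor text -> generated name
    std::size_t next_id_ = 0;                  // counter used to build "@__#"
public:
    const std::string& name_for(const std::string& ctor_text) {
        auto it = names_.find(ctor_text);
        if (it == names_.end())  // first time this text is seen: make a new name
            it = names_.emplace(ctor_text, "@__" + std::to_string(next_id_++)).first;
        return it->second;
    }
};
```

Because the table only holds inline declarations, a named `.typedef @int32 = int<32>` would never collide with the entry created for an inline `int<32>`, which matches the example above.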
If you have multiple top level definitions (a typedef, funcsig, const, global, funcdecl or expose) with the same name (even in different bundles),
future definitions will override previous ones.
## Limitations
Muc will only check for correct syntax, it is up to you to ensure your input is semantically valid.
The files muapi.h and mu-fastimpl.h were taken from 'mu/mu-impl-fast'; they are kept up to date on each commit of this repo.
If they are changed (or their behaviour changes), muc may not work correctly.
\ No newline at end of file
The files muapi.h and mu-fastimpl.h were taken from 'mu/mu-impl-fast' (master branch); they are kept up to date on each commit of this repo.
If they are changed (or their behaviour changes), muc may not work correctly.
If you are declaring a `.const` with an integer literal, the type must be declared before the literal
(i.e. it must be declared inline like `.const @foo <int<32>> = 3` or be declared in a previous line of the bundle or in a previous bundle).
In all other cases you may reference an entity before it is declared, as long as it is declared in the same bundle (this is the same restriction the Mu-spec places).
When using `-c`, float and double inline constants will be considered unique (for the purposes of generating inline declarations) if their text is unique, e.g.:
`%a = FADD <float> <float>1.1f <float>1.10f` will generate two constants, whereas when using `-r` it will only generate one (since 1.1 = 1.10).
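The difference between the two modes comes down to whether literals are compared as text or as parsed values. The following is a minimal sketch of that distinction, assuming literals arrive as strings such as `1.1f`; `unique_by_text` and `unique_by_value` are hypothetical helpers, not functions from muc.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Counts distinct constants when uniqueness is decided by literal text
// (the behaviour described for -c).
std::size_t unique_by_text(const std::vector<std::string>& literals) {
    return std::set<std::string>(literals.begin(), literals.end()).size();
}

// Counts distinct constants when uniqueness is decided by parsed value
// (the behaviour described for -r).
std::size_t unique_by_value(const std::vector<std::string>& literals) {
    std::set<float> values;
    for (const auto& lit : literals)
        values.insert(std::stof(lit)); // std::stof stops parsing at the trailing 'f'
    return values.size();
}
```

For the literals `1.1f` and `1.10f`, the first function counts two constants and the second counts one.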
Inline declarations will call `gen_sym` with names of the form `@__#` where '#' is a number; this is due to a limitation in Zebu (and possibly the mu-spec) where all top-level declarations
must have names. Bundle names will be generated in the form `@__bundle_#` (so to be safe, never write global names starting with a `__` in your input files).
\ No newline at end of file
......@@ -7,16 +7,18 @@ ir
;
topLevelDef
: '.typedef' nam=type '=' ctor=typeConstructor # TypeDef
| '.funcsig' nam=funcSig '=' ctor=funcSigConstructor # FuncSigDef
| '.const' nam=constant '<' ty=type '>' '=' ctor=constConstructor # ConstDef
| '.global' nam=globalVar '<' ty=type '>' # GlobalDef
| '.funcdecl' nam=func '<' sig=funcSig '>' # FuncDecl
| '.expose' nam=globalName '=' func callConv cookie=constant # FuncExpDef
| '.funcdef' nam=func 'VERSION' ver=name '<' sig=funcSig '>' '{' body+=basicBlock+ '}' # FuncDef
;
typeConstructor
: '.bundle' (NAME | GLOBAL_NAME) # BundleDef
| '.typedef' nam=topLevelName '=' ctor=typeConstructor # TypeDef
| '.funcsig' nam=topLevelName '=' ctor=funcSigConstructor # FuncSigDef
| '.const' nam=topLevelName '<' ty=type '>' '=' ctor=constConstructor # ConstDef
| '.global' nam=topLevelName '<' ty=type '>' # GlobalDef
| '.funcdecl' nam=func '<' sig=funcSig '>' # FuncDecl
| '.expose' nam=topLevelName '=' func callConv cookie=constant # FuncExpDef
| '.funcdef' nam=func
(('<' sig=funcSig '>' ('VERSION' ver=name)?) |
('VERSION' ver=name ('<' sig=funcSig '>')?))? '{' body+=basicBlock+ '}' # FuncDef
;
typeConstructor
: 'int' '<' length=intParam '>' # TypeInt
| 'float' # TypeFloat
| 'double' # TypeDouble
......@@ -52,23 +54,27 @@ constConstructor
;
type
: globalName
: nam=topLevelName
| typeConstructor
;
funcSig
: globalName
: nam=topLevelName
| funcSigConstructor
;
constant
: globalName
: NAME
| nam=name
| '<' ty=type '>' constConstructor
;
func
: globalName
: nam=topLevelName
;
globalVar
: globalName
: nam=topLevelName
;
basicBlock
......@@ -77,7 +83,7 @@ basicBlock
instName
: ('[' name ']')?
: ('[' nam=name ']')?
;
instResult
......@@ -147,12 +153,13 @@ destClause
;
bb
: name
: nam=name
;
value
: globalName
| localName
: NAME
| nam=localGlobalLocalName
| constant
;
typeList
......@@ -249,14 +256,23 @@ stringLiteral
: STRING_LITERAL
;
name // Must be declared before being used
: localGlobalName
| localName
topLevelName
: NAME
| globalLocalName
| globalName
;
name
: NAME
| localGlobalName
| localName
;
commInst: GLOBAL_NAME;
globalName : GLOBAL_NAME;
localName : LOCAL_NAME;
localGlobalLocalName: LOCAL_NAME; // A name that refers to a local entity if it's already defined, otherwise a global entity
globalLocalName: LOCAL_NAME; // A global name in local form
localGlobalName : GLOBAL_NAME; // A Local name in global form
// LEXER
......@@ -284,8 +300,10 @@ NAN_FP
: 'nan'
;
NAME : IDCHAR+
;
GLOBAL_NAME
: GLOBAL_NAME_PREFIX IDCHAR+
: GLOBAL_NAME_PREFIX IDCHAR+
;
LOCAL_NAME
......
// Contains stuff common to both RuntimeVisitor and CVisitor
#pragma once
#include <map>
#include <string>
#include <set>
#include <stack>
#include <vector>
#include <utility>
#include <exception>
#include <memory>
#include <limits>
#include <iostream>
#include <type_traits>
#include <sstream>
#include <gmpxx.h>
#include "muapi.h"
#include "parser/UIRBaseVisitor.h"
#include "parser/UIRParser.h"
#include "parser/UIRLexer.h"
using namespace std::literals;
// Returns the elements of 'v' separated by ", "
template<typename T>
std::string list_to_string(T v) {
    // Start empty: the loop below appends every element, including the first
    std::string value;
    bool start = true;
    for (auto val : v) {
        if (!start)
            value += ", "s;
        else start = false;
        value += std::to_string(val);
    }
    return value;
}
// Returns the elements of 'v' separated by ", " (strings need no conversion)
template<>
std::string list_to_string<std::vector<std::string>>(std::vector<std::string> v) {
    if (v.size() == 0)
        return ""s;
    std::string value = v[0];
    for (std::size_t i = 1; i < v.size(); i++)
        value += ", "s + v[i];
    return value;
}
mpz_class mask_64(UINT64_MAX);
bool error = false; // Was there a syntax error?
std::string primordial_name;
std::stack<std::string> parent_names {}; // The names of the parent entities
std::string last_name = ""; // The last name that was read (in global form)
std::string bundle_name = ""; // the name of the current bundle
void generate_bundle_name() {
static std::size_t bundle_id = 0;
::bundle_name = "__bundle__"s + std::to_string(bundle_id++);
}
// This is used to annotate the return types of visitor functions
template<typename T = void> using Any = antlrcpp::Any;
\ No newline at end of file
......@@ -7,13 +7,9 @@
#ifndef NO_MU
#include "RuntimeVisitor.hpp"
std::map<std::string, MuID> Runtime::Visitor::globals {};
std::map<MuID, int> Runtime::Visitor::int_sizes {};
#endif
#include "CVisitor.hpp"
std::map<std::string, std::string> C::Visitor::globals {};
std::map<std::string, int> C::Visitor::int_sizes {};
using namespace std::literals;
......@@ -36,7 +32,7 @@ std::string file_name(const char* s) {
int show_usage() {
std::cerr << "usage: " << std::endl;
std::cerr << "\tmuc -rc [-f primordial-function] input_files... output_dir/boot_image_file" << std::endl;
std::cerr << "\tmuc -rcs [-f primordial-function] input_files... output_dir/boot_image_file" << std::endl;
return -1;
}
......@@ -67,8 +63,31 @@ int main(int argc, char* argv[]) {
n_input = argc - 3;
has_primordial = false;
}
::primordial_name = primordial;
if (argv[1] == "-r"s) {
if (argv[1] == "-s"s) {
for (int i = 0; i < n_input; i++)
{
std::ifstream input_file (input[i]);
antlr4::ANTLRInputStream input_stream(input_file);
UIRLexer lexer(&input_stream);
antlr4::CommonTokenStream tokens(&lexer);
tokens.fill();
UIRParser parser(&tokens);
parser.ir();
auto le = lexer.getNumberOfSyntaxErrors();
auto pe = parser.getNumberOfSyntaxErrors();
if (le || pe)
std::cerr << "ERROR: " << input[i] << " (lexer " << le << ", parser " << pe << ")" << std::endl;
}
if (::error)
return -1;
} else if (argv[1] == "-r"s) {
#ifndef NO_MU
std::string options = "init_mu --aot-emit-dir="s + output_dir;
......@@ -76,15 +95,11 @@ int main(int argc, char* argv[]) {
MuVM* mvm = mu_fastimpl_new_with_opts(options.c_str());
MuCtx* ctx = mvm->new_context(mvm);
Runtime::primordial_name = primordial;
std::vector<MuID> ids = {};
for (int i = 0; i < n_input; i++) {
std::vector<MuID> idi = runtime_compile(ctx, std::string(input[i]));
ids.insert(std::end(ids), std::begin(idi), std::end(idi));
}
for (int i = 0; i < n_input; i++)
runtime_compile(ctx, std::string(input[i]));
if (Runtime::error) {
if (::error) {
std::cerr << "ERROR: input is invalid" << std::endl;
return -1;
}
......@@ -93,7 +108,7 @@ int main(int argc, char* argv[]) {
ctx->handle_from_func(ctx, Runtime::primordial_id) : nullptr;
ctx->make_boot_image(ctx,
&ids[0], ids.size(), // whitelist
&Runtime::whitelist[0], Runtime::whitelist.size(), // whitelist
primordial_exp, nullptr, nullptr, // primordial
nullptr, nullptr, 0, //sym
nullptr, nullptr, 0, // reloc
......@@ -110,9 +125,8 @@ int main(int argc, char* argv[]) {
<< "#include \"muapi.h\"" << std::endl
<< "#include \"mu-fastimpl.h\"" << std::endl
<< std::endl
<< "#define G(id, name) global_ ## id" << std::endl
<< "#define L(id, name) local_ ## id" << std::endl
<< "#define T(id) temp_ ## id" << std::endl
<< "#define G(id, ...) global_ ## id" << std::endl
<< "#define L(id, ...) local_ ## id" << std::endl
<< std::endl;
// Function header
......@@ -126,15 +140,10 @@ int main(int argc, char* argv[]) {
<< "\tMuCtx* " << ctx << " = " << mvm << "->new_context(" << mvm << ");" << std::endl
<< "\tMuIRBuilder* " << irbuilder << ";" << std::endl << std::endl;
C::primordial_name = primordial;
std::vector<std::string> ids = {};
for (int i = 0; i < n_input; i++) {
std::vector<std::string> idi = c_compile(ctx, input[i], std::cout, irbuilder);
ids.insert(std::end(ids), std::begin(idi), std::end(idi));
std::cout << std::endl;;
}
for (int i = 0; i < n_input; i++)
c_compile(ctx, input[i], std::cout, irbuilder);
if (C::error) {
if (::error) {
std::cerr << "ERROR: input is invalid" << std::endl;
return -1;
}
......@@ -142,9 +151,9 @@ int main(int argc, char* argv[]) {
std::string primordial_exp = has_primordial ?
ctx + "->handle_from_func("s + ctx + ", "s + C::primordial_id + ")"s : "NULL"s;
Array_String id ("MuID", ids);
Array_String ids ("MuID", C::whitelist);
std::cout << "\t" << ctx << "->make_boot_image(" << ctx << ", " << std::endl
<< "\t\t" << id.value << ", " << id.size << ", " << std::endl
<< "\t\t" << ids.value << ", " << ids.size << ", " << std::endl
<< "\t\t" << primordial_exp << ", NULL, NULL, NULL, NULL, 0, NULL, NULL, 0," << std::endl
<< "\t\t\"" << output_file << "\");" << std::endl;
......
T__0=1
T__1=2
T__2=3
T__3=4
T__4=5
T__5=6
T__6=7
T__7=8
T__8=9
T__9=10
T__10=11
T__11=12
T__12=13
T__13=14
T__14=15
T__15=16
T__16=17
T__17=18
T__18=19
T__19=20
T__20=21
T__21=22
T__22=23
T__23=24
T__24=25
T__25=26
T__26=27
T__27=28
T__28=29
T__29=30
T__30=31
T__31=32
T__32=33
T__33=34
T__34=35
T__35=36
T__36=37
T__37=38
T__38=39
T__39=40
T__40=41
T__41=42
T__42=43
T__43=44
T__44=45
T__45=46
T__46=47
T__47=48
T__48=49
T__49=50
T__50=51
T__51=52
T__52=53
T__53=54
T__54=55
T__55=56
T__56=57
T__57=58
T__58=59
T__59=60
T__60=61
T__61=62
T__62=63
T__63=64
T__64=65
T__65=66
T__66=67
T__67=68
T__68=69
T__69=70
T__70=71
T__71=72
T__72=73
T__73=74
T__74=75
T__75=76
T__76=77
T__77=78
T__78=79
T__79=80
T__80=81
T__81=82
T__82=83
T__83=84
T__84=85
T__85=86
T__86=87
T__87=88
T__88=89
T__89=90
T__90=91
T__91=92
T__92=93
T__93=94
T__94=95
T__95=96
T__96=97
T__97=98
T__98=99
T__99=100
T__100=101
T__101=102
T__102=103
T__103=104
T__104=105
T__105=106
T__106=107
T__107=108
T__108=109
T__109=110
T__110=111
T__111=112
T__112=113
T__113=114
T__114=115
T__115=116
T__116=117
T__117=118
T__118=119
T__119=120
T__120=121
T__121=122
T__122=123
T__123=124
T__124=125
T__125=126
T__126=127
T__127=128
T__128=129
T__129=130
T__130=131
T__131=132
T__132=133
T__133=134
T__134=135
T__135=136
T__136=137
T__137=138
T__138=139
T__139=140
T__140=141
T__141=142
T__142=143
T__143=144
T__144=145
T__145=146
T__146=147
T__147=148
T__148=149
T__149=150
T__150=151
T__151=152
T__152=153
T__153=154
T__154=155
T__155=156
T__156=157
T__157=158
T__158=159
T__159=160
T__160=161
T__161=162
T__162=163
T__163=164
INT_DEC=165
INT_OCT=166
INT_HEX=167
FP_NUM=168
INF=169
NAN_FP=170
GLOBAL_NAME=171
LOCAL_NAME=172
STRING_LITERAL=173
WS=174
LINE_COMMENT=175
C_COMMENT=176
'.typedef'=1
'='=2
'.funcsig'=3
'.const'=4
'<'=5
'>'=6
'.global'=7
'.funcdecl'=8
'.expose'=9
'.funcdef'=10
'VERSION'=11
'{'=12
'}'=13
'int'=14
'float'=15
'double'=16
'uptr'=17
'ufuncptr'=18
'struct'=19
'hybrid'=20
'array'=21
'vector'=22
'void'=23
'ref'=24
'iref'=25
'weakref'=26
'tagref64'=27
'funcref'=28
'threadref'=29
'stackref'=30
'framecursorref'=31
'irbuilderref'=32
'('=33
')'=34
'->'=35
'NULL'=36
'EXTERN'=37
'['=38
']'=39
':'=40
'SELECT'=41
'BRANCH'=42
'BRANCH2'=43
'SWITCH'=44
'CALL'=45
'TAILCALL'=46
'RET'=47
'THROW'=48
'EXTRACTVALUE'=49
'INSERTVALUE'=50
'EXTRACTELEMENT'=51
'INSERTELEMENT'=52
'SHUFFLEVECTOR'=53
'NEW'=54
'NEWHYBRID'=55
'ALLOCA'=56
'ALLOCAHYBRID'=57
'GETIREF'=58
'GETFIELDIREF'=59
'PTR'=60
'GETELEMIREF'=61
'SHIFTIREF'=62
'GETVARPARTIREF'=63
'LOAD'=64
'STORE'=65
'CMPXCHG'=66
'WEAK'=67
'ATOMICRMW'=68
'FENCE'=69
'TRAP'=70
'WATCHPOINT'=71
'WPEXC'=72
'WPBRANCH'=73
'CCALL'=74
'NEWTHREAD'=75
'THREADLOCAL'=76
'SWAPSTACK'=77
'COMMINST'=78
'<['=79
']>'=80
'EXC'=81
'KEEPALIVE'=82
'RET_WITH'=83
'KILL_OLD'=84
'PASS_VALUES'=85
'THROW_EXC'=86
'ADD'=87
'SUB'=88
'MUL'=89
'UDIV'=90
'SDIV'=91
'UREM'=92
'SREM'=93
'SHL'=94
'LSHR'=95
'ASHR'=96
'AND'=97
'OR'=98
'XOR'=99
'FADD'=100
'FSUB'=101
'FMUL'=102
'FDIV'=103
'FREM'=104
'EQ'=105
'NE'=106
'SGT'=107
'SLT'=108
'SGE'=109
'SLE'=110
'UGT'=111
'ULT'=112
'UGE'=113
'ULE'=114
'FTRUE'=115
'FFALSE'=116
'FUNO'=117
'FUEQ'=118
'FUNE'=119
'FUGT'=120
'FULT'=121
'FUGE'=122
'FULE'=123
'FORD'=124
'FOEQ'=125
'FONE'=126
'FOGT'=127
'FOLT'=128
'FOGE'=129
'FOLE'=130
'TRUNC'=131
'ZEXT'=132
'SEXT'=133
'FPTRUNC'=134
'FPEXT'=135
'FPTOUI'=136
'FPTOSI'=137
'UITOFP'=138
'SITOFP'=139
'BITCAST'=140
'REFCAST'=141
'PTRCAST'=142
'NOT_ATOMIC'=143
'RELAXED'=144
'CONSUME'=145
'ACQUIRE'=146
'RELEASE'=147
'ACQ_REL'=148
'SEQ_CST'=149
'XCHG'=150
'NAND'=151
'MAX'=152
'MIN'=153
'UMAX'=154
'UMIN'=155
'#N'=156
'#Z'=157
'#C'=158
'#V'=159
'#DEFAULT'=160
'f'=161
'bitsf'=162
'd'=163
'bitsd'=164
'nan'=170
T__0=1
T__1=2
T__2=3
T__3=4
T__4=5
T__5=6
T__6=7
T__7=8
T__8=9
T__9=10
T__10=11
T__11=12
T__12=13
T__13=14
T__14=15
T__15=16
T__16=17
T__17=18
T__18=19
T__19=20
T__20=21
T__21=22
T__22=23
T__23=24
T__24=25
T__25=26
T__26=27
T__27=28
T__28=29
T__29=30
T__30=31
T__31=32
T__32=33
T__33=34
T__34=35
T__35=36
T__36=37
T__37=38
T__38=39