diff --git a/AGENTS.md b/AGENTS.md
index b07a7fa..1e25849 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -20,6 +20,26 @@
 - Keep file names lowercase with underscores (e.g., `src/builtins/echo/echo.c`).
 - Keep headers in `include/` and expose only what modules need.

+## Parser & Lexer Functionality (Current `src/parser`)
+- Runtime entrypoint is `parse(line, minishell)` from `src/minishell.c` (`readline -> parse -> execute`).
+- `parse` splits the input line by unquoted `|` using `extract_next_command` + `find_boundary`.
+- Each non-empty segment is trimmed and converted into a `t_command` via `cmdnew`.
+- `set_argv` splits by unquoted spaces; quote characters are preserved in the resulting argument text.
+- `expand_envs` is currently a TODO (no `$VAR` expansion is applied in parser stage).
+- Redirections/heredoc are not converted into `t_command.redirections` yet in `src/parser/parser.c`.
+- `set_path` resolves builtins and direct paths (`/`, `./`, `../`), otherwise searches `PATH` with `access(..., F_OK)`.
+- `src/parser/lexer.c` provides a separate lexer (`lex`) that tokenizes into `TOKEN_WORD`, `TOKEN_PIPE`, `TOKEN_REDIRECT_IN`, `TOKEN_REDIRECT_OUT`, `TOKEN_APPEND`, and `TOKEN_HEREDOC`.
+- The lexer tracks single/double quote context so metacharacters inside quotes remain part of words.
+- Meta runs are read as contiguous chunks in `read_token` (for example, repeated `|`/`<`/`>` are captured as one token value).
+- Current parser flow does not consume the lexer output yet.
+
+## Parser & Lexer Known Gaps
+- `src/parser/parser.c` currently calls `tokenize()` with no valid declaration/definition in that unit, causing a build error with `-Werror`.
+- `src/parser/parser.c` writes `command->infile` and `command->outfile`, but those fields are not present in `t_command` (`include/core.h`), causing build errors.
+- `src/parser/parser.c` keeps a `tokens` variable that is unused, also failing under `-Werror`.
+- `include/parser.h` exports `parse` only; `lex` is not declared in public headers.
+- No explicit unmatched-quote syntax error handling is implemented in parser/lexer path.
+
 ## Testing Guidelines
 - There is no automated test runner. Use manual checks in `docs/tests.md` and basic shell behavior checks (pipes, redirects, builtins).
 - When debugging memory issues, run under valgrind and use the suppression file in `valgrind/readline.supp`.
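
Note on the "splits the input line by unquoted `|`" bullet: the diff does not show how `find_boundary` is implemented, so the following is only a minimal sketch of one way a quote-aware boundary search could work. The names `first_unquoted_pipe` and `count_segments` are hypothetical and do not exist in the repository. The sketch assumes a quote is closed only by the same quote character and that an unmatched quote simply runs to the end of the line, which mirrors the "no unmatched-quote error handling" gap listed above rather than fixing it.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not the repository's find_boundary): returns the
 * index of the first '|' in `line` that lies outside single and double
 * quotes, or -1 if there is none.
 */
long first_unquoted_pipe(const char *line)
{
    char    quote = '\0';   /* currently open quote character, or '\0' */
    size_t  i;

    for (i = 0; line[i] != '\0'; i++)
    {
        char c = line[i];

        if (quote != '\0')
        {
            if (c == quote)
                quote = '\0';       /* matching quote closes the context */
        }
        else if (c == '\'' || c == '"')
            quote = c;              /* a quote opens a new context */
        else if (c == '|')
            return ((long)i);       /* pipe outside any quotes */
    }
    return (-1);
}

/*
 * Hypothetical helper: counts the pipe-separated segments of `line`,
 * including empty ones. Restarting the scan after each boundary is safe
 * because a boundary is only ever reported outside quotes.
 */
size_t count_segments(const char *line)
{
    size_t  count = 1;
    long    pos;

    while ((pos = first_unquoted_pipe(line)) >= 0)
    {
        count++;
        line += pos + 1;
    }
    return (count);
}
```

Under these assumptions, `echo "a|b" | cat` yields two segments, while `echo 'a|b'` stays a single segment because its only `|` is inside quotes.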