Overthinking Web Application URI Routing (PHP)

I just got around to implement URI routing for my PHP framework. Given a setup such as

001  Route::get('page/landing', 'controller#action');
002  Route::get('page/profile/:id', 'controller#action');
003  Route::get('page/profile', 'controller#action');
004  Route::post('page/profile/:id', 'controller#action');

I figured it would be a waste to match every possible route against the request's URI until a match was found. Considering that there is an implied tree structure:

It seemed smart to preprocess the routes into a tree structure, and save said structure into memory for future requests to match directly against the request URI.

I ended up just serializing the structure and saving it into a file. The ideal solution would be to write a php file that directly creates the object tree (without dynamically parsing the routes first) and enabling opcache for it. The algorithms took me a bit of thinking to work out, but that was the fun part. Originally I had methods to write API and WEB routes, much like Laravel, until I noticed that the approach didn't work well with the tree structure. Thankfully it's not a must-have kind of abstraction anyways.

001  private function prepare_data()
002  {
003      // use $this->routes to build tree data
004      // set $this->prepared routes
005      // write 'system/data/routes.text' from serialized tree
006  
007      $tokens = array(new FrameworkRouteToken('START'));
008  
009      foreach ($this->routes as $route) {
010          $current = $tokens[0];
011          foreach ($route->tokens as $tok) {
012              $found = false;
013              foreach ($current->next as $check) {
014                  if (FrameworkRouteToken::equals($tok, $check)) {
015                      $current = $check;
016                      $found = true;
017                      break;
018                  }
019              }
020              if (!$found) {
021                  array_push($current->next, $tok);
022                  $current = $tok;
023              }
024          }
025      }
026      // information for default route
027      if (FrameworkRoute::$default_target) {
028          $tok = new FrameworkRouteToken('DEFAULT');
029          $target = FrameworkRoute::$default_target;
030          $target = explode('#', $target);
031          $tok->controller_name = ucfirst($target[0]).'Controller';
032          $tok->action_name = $target[1];
033          array_push($tokens, $tok);
034      }
035      $this->prepared_routes = $tokens;
036      file_put_contents('system/data/routes.txt', serialize($tokens));
037  }
038  
039  public function match_route()
040  {
041      $params = array();
042      $uri_route = array();
043      array_push($uri_route, $this->request->method);
044      var_dump($this->request->uri_elements);
045      foreach ($this->request->uri_elements as $el) {
046          array_push($uri_route, $el);
047      }
048      $tokens = $this->prepared_routes[0]->next;
049      foreach ($uri_route as $key => $str) {
050          foreach ($tokens as $tok) {
051              if ($tok->wildcard_f) {
052                  if ($tok->match_condition) {
053                      if (preg_match("/".$tok->match_condition."/", $str)) {
054                          $params[$tok->name] = $str;
055                          if ($key === 
056                              array_key_last($uri_route) &&
057                              $tok->controller_name)
058                          {
059                              $this->match_controller =
060                                  $tok->controller_name;
061                              $this->match_action = $tok->action_name;
062                              break 2;
063                          } else {
064                              $tokens = $tok->next;
065                              break;
066                          }
067                      } else {
068                          continue;
069                      }
070                  } else {
071                      $params[$tok->name] = $str;
072                      if ($key ===
073                          array_key_last($uri_route) &&
074                          $tok->controller_name)
075                      {
076                          $this->match_controller = $tok->controller_name;
077                          $this->match_action = $tok->action_name;
078                          break 2;
079                      } else {
080                          $tokens = $tok->next;
081                          break;
082                      }
083                  }
084              } else {
085                  if ($tok->name == $str) {
086                      if ($key ===
087                          array_key_last($uri_route) &&
088                          $tok->controller_name)
089                      {
090                          $this->match_controller = $tok->controller_name;
091                          $this->match_action = $tok->action_name;
092                          break 2;
093                      } else {
094                          $tokens = $tok->next;
095                          break;
096                      }
097                  } else {
098                      continue;
099                  }
100              }
101          }
102      }
103      if (!$this->match_controller) {
104          $target = FrameworkRoute::$default_target;
105          $target = explode('#', $target);
106          $this->match_controller =
107              $this->prepared_routes[1]->controller_name;
108          $this->match_action = $this->prepared_routes[1]->action_name;
109      }
110  }

Anyways. I took inspiration from Crafting Interpreters that teaches using a trie structure for parsing keywords. That I ended up writing a tree instead was a naive oversight on my part. Although going with a tree instead was probably the right call considering that parameter nodes need additional checking anyways.

Long story short, I had fun making use of knowledge I obtained from reading through a programming book semi-casually. The ideal URI routing algorithm probably depends on an application's routes anyways. What I did is quite overkill for a small app with only a couple of possible routes, for example.

Also, writing this post certainly had a purpose because I just noticed that there's an ambiguity issue. Given a routes.php of this setup:

001  Route::get('page/profile/:id');
002  Route::get('page/profile/extended/:id');

The URI https://www.example.org/page/profile/extended/5 would attempt to match against the first route page/profile/:id setting $param['id'] = 'extended' and not finding a matching token for the 5 anymore.

But I suppose simply ordering the arrays to put wildcards last should circumvent this issue.